Abstract: Accurate classification of subjective and objective sentences is an important preparatory step for micro-blog sentiment analysis. Since a single feature type cannot provide enough subjective information for classification, we propose a Support vector machine (SVM)-based classification model for Chinese micro-blogs using multiple features. We extracted subjective features from Part of speech (POS) tags and the dependency relationships between words, and constructed a 3-POS subjective pattern set and a dependency template set. We fused these two types of features and used an SVM-based model to classify Chinese micro-blog text. The experimental results showed that the performance of the classification model improved remarkably when using multiple features.
Abstract: Existing decompilers use rule-based algorithms to transform an unstructured Control flow graph (CFG) into equivalent high-level programming language constructs with "goto" statements. One problem with such approaches is that they generate a large number of "goto"s in the output code, which reduces the readability and hinders the understanding of input binaries. A global search algorithm based on structural analysis is proposed. This algorithm restructures a CFG and generates fewer "goto" statements than the rule-based algorithm does. We also present a Genetic algorithm (GA) for the global search approach to locate near-optimal solutions for large CFGs. Evaluation results on a set of real CFGs show that the genetic algorithm-based heuristic for global search is capable of finding high-quality solutions.
Abstract: To solve the task of detecting and recounting events in videos with limited training examples, we propose a novel two-stage hybrid concept temporal pooling approach that is aware of potential concept drift in the video stream. We initially partition videos into temporal pyramids consisting of keyframes. Semantic concepts in keyframes are detected, which enables us to derive aggregated detection scores for each temporal pyramid using average-pooling and ultimately for the entire video via max-pooling. Owing to this refined hybrid pooling, our method yields more discriminative semantic representations with respect to the event query. We also develop an effective filtering strategy to cope with noisy concept detectors and make textual description generation in recounting more robust. Experiments on the large-scale TRECVID MEDTest dataset demonstrate that our method improves accuracy over state-of-the-art methods, both for event detection and recounting.
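As an illustration of the pooling scheme described above, the following minimal Python sketch (with invented scores for a single concept rather than a full concept bank) average-pools keyframe scores within temporal segments and max-pools across them:

```python
# Hypothetical sketch of the two-stage hybrid pooling: keyframe-level
# concept scores are average-pooled within each temporal segment, and
# the segment scores are max-pooled to form the video-level score.
def hybrid_pool(keyframe_scores, n_segments):
    n = len(keyframe_scores)
    # Partition the keyframe scores into contiguous temporal segments.
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    segment_means = []
    for a, b in zip(bounds, bounds[1:]):
        seg = keyframe_scores[a:b]
        segment_means.append(sum(seg) / len(seg))
    # Max-pooling across segments keeps the strongest local evidence.
    return max(segment_means)

scores = [0.1, 0.2, 0.9, 0.8, 0.1, 0.1]  # one concept, six keyframes
video_score = hybrid_pool(scores, 3)  # ~0.85: the middle segment dominates
```

Average-pooling within a segment smooths detector noise, while the max across segments preserves events that appear in only part of the video.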
Abstract: Code clones are similar code fragments appearing in software. As software evolves, code clones may be subjected to changes as well; we term this clone evolution. There have not been many investigations into clone evolution characteristics. Therefore, we tackle this by exploring useful information associated with changes of clones during evolution. We focus on three perspectives of clone evolution, ranging from individual clone changes to the characterization of clone genealogies. With the help of Xmeans clustering, we establish associations between clone changes and the lifespan of clones. Our experimental results on two software systems show that clones are mostly stable throughout software evolution. For the relatively smaller group of "unstable" clones, changes usually happen after several versions, and consistent changes appear more frequently than inconsistent ones. We suggest that developers should pay more attention to relatively longer genealogies, and should consider applying changes consistently to a clone group when a constituent clone fragment has undergone change.
Abstract: Advances in quantum computation threaten to break public key cryptosystems such as RSA, ECC, and ElGamal that are based on the difficulty of factorization or of taking a discrete logarithm, although up to now, no quantum algorithms have been found that are able to solve certain mathematical problems on non-commutative algebraic structures. Against this background, some novel public key cryptosystems based on the Polynomial symmetrical decomposition (PSD) problem have been proposed. We find that these schemes are not secure. We show that they are vulnerable to structural attacks, linearization-equation attacks, and attacks based on overdefined systems of multivariate polynomial equations, each of which requires only polynomial time to retrieve the secret key from a given public key. We also propose an improvement to enhance public key cryptography based on the PSD problem. In addition, we discuss possible lines of future work.
Abstract: As the Box-Jenkins method can neither grasp the non-stationary characteristics of a time series exactly, nor identify the optimal forecasting model order quickly and precisely, a self-adaptive processing and forecasting algorithm for univariate linear time series is proposed. A self-adaptive series characteristic test framework, which employs a variety of statistical tests, is constructed to solve the problem of inaccurate identification and inadequate processing of the non-stationary characteristics of time series. To achieve favorable forecasts, an optimal forecasting model building algorithm combining a model filter and a candidate model pool is proposed, in which a univariate linear time series forecasting model is built. Experimental results demonstrate that the proposed algorithm outperforms the comparative method in all forecasting performance statistics.
Abstract: Multimedia applications contain multi-branch loops, which are inefficient to map onto traditional Single instruction multiple data (SIMD) structures. Considering this, we propose a multi-instruction streams extension method for traditional SIMD structures. The main idea is to simultaneously dispatch multiple instruction streams to multiple lanes. Compared with traditional SIMD, whose lanes receive a unified single instruction stream but execute conditionally through a lane mask vector, the multi-instruction streams extension grants each of its lanes the ability to receive and execute the instructions of one particular branch path. Thus, multi-branch loops in applications can be mapped efficiently. The design is implemented in the Verilog language and then integrated into the FT-Matrix vector-SIMD chip. Application profiling results show that the proposed method incurs a mere 2.61% area overhead while obtaining about a 1.8x to 2.4x performance gain.
Abstract: By exploiting the data-level and instruction-level parallelism of symmetric ciphers, a reconfigurable processor architecture for symmetric ciphers is presented based on a Very-long instruction word (VLIW) structure, and an application-specific instruction-set system for symmetric ciphers is proposed. For the common arithmetic operations of symmetric ciphers, eleven kinds of reconfigurable cryptographic arithmetic units are designed using reconfigurable technology. To meet the requirement of energy-efficient design, a loop buffer structure for the instruction fetching unit is proposed, which reduces power consumption significantly at the same frequency as a conventional design; meanwhile, a chain processing mechanism is proposed to improve cryptographic throughput without any area overhead. The processor has been fabricated in a 0.18μm CMOS technology. The results show that it can work at up to 200MHz; fourteen kinds of cryptographic algorithms were mapped onto the processor, and the encryption throughputs of the AES, SNOW2.0, and SHA2 algorithms reach 1.19Gbps, 1.05Gbps, and 407Mbps respectively.
Abstract: As conventional feature selection algorithms are prone to poor running efficiency on large-scale datasets with interacting features, this paper proposes a novel rough feature selection algorithm whose innovation centers on a layered co-evolutionary strategy with a neighborhood radius hierarchy. This hierarchy can adapt the rough feature scales among different layers and produce reasonable decompositions by exploiting any correlation and interdependency among feature subsets. Both neighborhood interaction within a layer and neighborhood cascade between layers are adopted to implement the interactive optimization of the neighborhood radius matrix, so that both the optimal rough feature selection subsets and their global optimal set are obtained efficiently. Our experimental results substantiate that the proposed algorithm achieves better effectiveness, accuracy, and applicability than some traditional feature selection algorithms.
Abstract: It is challenging to select suitable services from abundant candidates in a cloud environment. Aiming at the characteristics of the batch computing mode and the stream computing mode, a novel trustworthy service selection approach is proposed that integrates the cloud model and interval number theory. To help potential users understand the quality of a service, the trustworthiness of a service is described with an interval number using a reverse cloud generator, and services with poor performance are filtered out by employing deviation degree or proximity degree. Two formulas for the possibility degree of interval numbers are designed to compare trustworthiness values between cloud services, utilizing probability zone analysis and geometrical analysis respectively, and a ranking method based on the possibility degree of interval numbers is exploited to select the most trustworthy service. The experiments show that this approach is effective in improving the accuracy of service selection and selecting trustworthy services for potential users in the cloud paradigm.
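For illustration, one widely used possibility-degree formula for comparing two interval numbers can be sketched as follows; the paper itself derives two variants via probability zone analysis and geometrical analysis, which are not reproduced here:

```python
def possibility_degree(a, b):
    """P(A >= B) for intervals A = [aL, aU], B = [bL, bU].

    Standard formula, assuming both intervals have positive width:
    the overlap-based ratio clipped to [0, 1].
    """
    aL, aU = a
    bL, bU = b
    return min(max((aU - bL) / ((aU - aL) + (bU - bL)), 0.0), 1.0)

# A = [2, 4] mostly exceeds B = [1, 3]: degree 0.75
# A = [5, 6] entirely exceeds B = [1, 2]: degree 1.0
```

Ranking services then amounts to comparing the pairwise possibility degrees of their trustworthiness intervals.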
Abstract: A histogram-shape-based method is utilized to design a watermarking algorithm to protect Depth-image-based rendering (DIBR) 3D images. To make the watermarking method robust to common attacks, more suitable pixel groups, which are insensitive to geometric attacks, are selected for watermark embedding. Taking into account that adjusting the baseline distance in the DIBR process might affect watermark extraction, a post-processing step after the DIBR process is proposed to make the watermarking method more robust to the DIBR process. As the experimental results show, the proposed method is much more robust to geometric attacks and combined attacks than existing methods. In addition, the proposed watermarking method is also robust to baseline distance adjustment and depth-image blurring.
Abstract: This paper concerns the online solution of complex-valued linear matrix equations in the complex domain. Differing from the real-valued neural network, which is only designed for solving real-valued linear matrix equations in the real domain, a fully complex-valued Gradient neural network (GNN) is developed for computing complex-valued linear matrix equations. The fully complex-valued GNN model has the merit of reducing unnecessary complexities in theoretical analysis and real-time computation, as compared to the real-valued neural network. Besides, the convergence analysis of the proposed complex-valued GNN model is presented, and simulation experiments are performed to substantiate the effectiveness and superiority of the proposed complex-valued GNN model for online computation of complex-valued linear matrix equations in the complex domain.
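The GNN design can be illustrated on the scalar special case a*x = b. The sketch below uses an assumed Euler discretization of the gradient dynamics dx/dt = -gamma * conj(a) * (a*x - b), which descends the error norm |a*x - b|^2 / 2; it is a minimal illustration, not the paper's exact model:

```python
# Minimal scalar sketch (assumed Euler discretization) of a complex-valued
# gradient neural network for solving a*x = b in the complex domain.
def gnn_solve(a, b, gamma=1.0, dt=0.1, steps=200):
    x = 0 + 0j
    for _ in range(steps):
        # Gradient descent on |a*x - b|^2 / 2 over the complex field.
        x = x - gamma * a.conjugate() * (a * x - b) * dt
    return x

a, b = 1 + 2j, 3 - 1j
x = gnn_solve(a, b)
# x converges to the exact solution b / a = 0.2 - 1.4j
```

The residual contracts by |1 - gamma*dt*|a|^2| per step, so convergence requires the step size to satisfy gamma*dt*|a|^2 < 2.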
Abstract: FPGA-based soft vector processing accelerators are used frequently to perform highly parallel data processing tasks. Since they are not able to implement complex control manipulations in software, most FPGA systems now incorporate either a soft processor or a hard processor. An FPGA-based, AXI bus compatible vector accelerator architecture is proposed which utilises fully pipelined, heterogeneous ALUs for performance, and microcoding is employed for reusability. The design is tested with several design examples in four different lane configurations. Compared with Central processing unit (CPU), Digital signal processor (DSP), Altera C2H tool, and OpenCL SDK implementations, the vector processor improves on execution time and energy consumption by factors of up to 6.6 and 6.4 respectively.
Abstract: This paper describes a 4-cylinder hydraulic circuit system controlled by a new cross-coupled fuzzy PID, whose leveling algorithm is designed to meet the demand of the loading system for a test bed. In order to enable synchronous loading and implement precise leveling control, the output displacement derived from a theoretical model, called Żn, is tracked in real time. Using AMESim software, the reliability of the scheme designed with master-slave control and fuzzy PID cross-coupling control has been confirmed, leading to the conclusion that the more serious the load inconsistencies among the four cylinders are, the larger the corresponding synchronization errors will be. Further, leveling experiments have demonstrated that the synchronization performance of the leveling system can be improved significantly by adjusting the load values of the four cylinders to compensate for their un-synchronization.
Abstract: To lower communication complexity, a Certificateless homomorphic encryption (CLHE) scheme based on the Learning with errors (LWE) problem is constructed by introducing a new technique called probabilistic encoding with a weakly homomorphic property. This technique conveniently converts an intended message into two elements in a ring, which are respectively encrypted under the two public keys of a user in a certificateless cryptosystem. Knowing both elements simultaneously, the original message can be easily recovered; it is hidden perfectly by the probabilistic property of the encoding. This CLHE removes evaluation keys by using the approximate eigenvector method given by Gentry et al., which makes it a pure CLHE. It is proven to be semantically secure in the Random oracle model (ROM). The results indicate it is able to homomorphically evaluate any function in a class of functions with a given multiplicative depth L.
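A toy sketch of the encoding idea (all details invented for illustration; the actual scheme encrypts each element under a public key, which is omitted here): a message in Z_q is split into two ring elements whose sum recovers it, one element alone being uniformly random, and the split is additively homomorphic:

```python
import random

# Toy probabilistic encoding: split m into (r, m - r) mod q. Either
# element alone is uniformly distributed and reveals nothing about m;
# both together recover m, and encodings add componentwise.
def encode(m, q, rng):
    r = rng.randrange(q)
    return r, (m - r) % q

def decode(e, q):
    return (e[0] + e[1]) % q

q = 257
rng = random.Random(1)
e1 = encode(40, q, rng)
e2 = encode(60, q, rng)
s = ((e1[0] + e2[0]) % q, (e1[1] + e2[1]) % q)  # componentwise sum
print(decode(s, q))  # 100: the encodings add homomorphically
```

This mirrors how the abstract's "probabilistic encoding" hides the message perfectly while remaining weakly homomorphic.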
Abstract: Category-based statistical language models are an important method for solving the problem of sparse data in statistical language models. But there are two bottlenecks in this model: 1) the problem of word clustering, as it is hard to find a suitable clustering method that has good performance without a large amount of computation; 2) class-based methods always lose some prediction ability when adapting to text from different domains. In order to solve the above problems, a novel definition of word similarity utilizing mutual information is presented. Based on word similarity, the definition of word set similarity is given and a bottom-up hierarchical clustering algorithm is proposed. Experimental results show that the word clustering algorithm based on word similarity is better than the conventional greedy clustering method in speed and performance; the perplexity is reduced from 283 to 207.8.
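A minimal sketch of one way a mutual-information-based word similarity could look (the counts and the PPMI-cosine formulation here are illustrative assumptions, not the paper's exact definition): each word is represented by its positive pointwise mutual information with context words, and similarity is the cosine of these vectors.

```python
import math
from collections import Counter

# Invented toy co-occurrence data: (word, context word) pairs.
pairs = [("cat", "pet"), ("cat", "fur"), ("dog", "pet"), ("dog", "fur"),
         ("dog", "bark"), ("car", "road"), ("car", "wheel")]
cooc = Counter(pairs)
w_tot = Counter(w for w, _ in pairs)
c_tot = Counter(c for _, c in pairs)
total = len(pairs)

def ppmi_vector(w):
    # Keep only positive pointwise mutual information values.
    vec = {}
    for (ww, c), n in cooc.items():
        if ww == w:
            pmi = math.log(n * total / (w_tot[w] * c_tot[c]))
            if pmi > 0:
                vec[c] = pmi
    return vec

def cosine(u, v):
    num = sum(x * v.get(k, 0.0) for k, x in u.items())
    den = (math.sqrt(sum(x * x for x in u.values()))
           * math.sqrt(sum(x * x for x in v.values())))
    return num / den if den else 0.0

def sim(a, b):
    return cosine(ppmi_vector(a), ppmi_vector(b))
# "cat" shares contexts with "dog" but none with "car"
```

A bottom-up clustering would then repeatedly merge the most similar pair of word sets under a set-level extension of this similarity.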
Abstract: Noise reduction is a very important topic in image processing. We propose a new method to deal with the case where a noisy image has different noise levels in different regions. The main idea is to automatically segment the noisy image into several sub-images so that each sub-image has approximately the same noise level. We apply Block matching 3D filtering (BM3D) to these sub-images in order to obtain denoised sub-images. We then merge the sub-images together and enhance the discontinuous regions between them by performing BM3D again on small image patches. Our experimental results show the effectiveness of the proposed method in terms of Peak signal to noise ratio (PSNR) when compared with bivariate wavelet shrinkage and the standard BM3D method. In addition to Gaussian white noise, our method performs better than bivariate wavelet shrinkage and the standard BM3D method even for signal-dependent noise.
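The segmentation step depends on estimating local noise levels. A common robust estimator (assumed here for illustration; BM3D itself is out of scope and omitted) uses the median absolute deviation of horizontal pixel differences:

```python
import random
import statistics

# Estimate a block's noise std from first differences: for a locally flat
# block with Gaussian noise, diffs ~ N(0, 2*sigma^2), and the median of
# |diffs| divided by 0.6745*sqrt(2) is a robust estimate of sigma.
def estimate_sigma(block):
    diffs = [abs(row[i + 1] - row[i])
             for row in block for i in range(len(row) - 1)]
    return statistics.median(diffs) / (0.6745 * 2 ** 0.5)

rng = random.Random(0)
flat = lambda s: [[100 + rng.gauss(0, s) for _ in range(32)]
                  for _ in range(32)]
low, high = flat(5.0), flat(20.0)
# estimate_sigma(low) is near 5, estimate_sigma(high) near 20, so the
# two regions would be assigned to different sub-images before denoising.
```

Blocks with similar estimates are then grouped into one sub-image, so each group can be denoised with a single, well-matched noise parameter.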
Abstract: To achieve high classification accuracy on hyperspectral data, a dimensionality reduction algorithm called Sample-dependent repulsion graph regularized auto-encoder (SRGAE) is proposed. Based on the sample-dependent graph, a repulsion force is applied to nearby samples from different classes, building a sample-dependent repulsion graph that projects samples from the same class close together and samples from different classes far apart. The sample-dependent repulsion graph can avoid the neighborhood parameter selection problem that exists in the nearest-neighborhood graph. By integrating the advantages of deep learning and the graph regularization technique, SRGAE keeps the learned deep features consistent with the inherent manifold structure of the original hyperspectral data. Experimental results on two real hyperspectral datasets show that, compared with some popular dimensionality reduction algorithms, the proposed SRGAE yields higher classification accuracy.
Abstract: In this paper, a hierarchical n-gram Language model (LM) combining words and characters is explored to improve the detection of Out-of-vocabulary (OOV) words in Mandarin Spoken term detection (STD). The hierarchical LM is based on a word-level LM, with a character-level LM estimating probabilities of OOV words in a class-based way. The region containing OOV words in the sentence to be decoded is detected with the help of the word-level LM, and the probabilities of OOV words are derived from the character-level LM. The implementation of the proposed approach is based on a dynamic decoder. The proposed approach is evaluated in terms of Actual term weighted value (ATWV) on two Mandarin data sets. Experimental results show that more than 10% relative improvement in OOV word detection is achieved on both sets. In addition, the detection of In-vocabulary (IV) words is barely affected.
Abstract: Fusing measured data is an effective way to improve data processing precision. In this paper, the fusion weight is first introduced, and then we study the optimal weight and parameter estimation using multi-structure and unequal-precision data fusion. For the linear regression model, it is theoretically proved that the optimal weight is related only to the data measurement precision, which is consistent with the classical Gauss-Markov theorem. For the nonlinear regression model, we analyze the method for calculating the optimal weight theoretically, and then provide the algorithm for the optimal weight and the parameter estimation for actual data fusion.
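For the linear case, the classical Gauss-Markov result the abstract refers to is inverse-variance weighting: each measurement's optimal weight depends only on its precision. A minimal sketch with illustrative values:

```python
# Inverse-variance (Gauss-Markov) fusion: weight each measurement by
# 1/sigma^2, normalized so the weights sum to one. This minimizes the
# variance of the fused estimate for unbiased, independent measurements.
def fuse(estimates, sigmas):
    inv = [1.0 / s ** 2 for s in sigmas]
    total = sum(inv)
    weights = [w / total for w in inv]
    return sum(w * x for w, x in zip(weights, estimates)), weights

# Two sensors measuring the same quantity with different precisions:
# the precise sensor (sigma = 1) gets weight 0.9, the coarse one 0.1.
value, weights = fuse([10.0, 10.6], [1.0, 3.0])
```

With equal precisions this reduces to the ordinary mean, matching the intuition that the optimal weight is determined by measurement precision alone.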
Abstract: As the wavelet packet transform is able to focus on minute changes in signals, this study proposes an analytic approach to low-embedding-rate steganalysis based on high-order Histogram moments in the frequency domain (HMFD), which provides a key solution to the feature selection and extraction of HMFD. The detection results are tested on LSB matching steganography at different embedding rates in speech signals, and it is shown that the detection performance with HMFD applied is greater than that of histogram statistical moments. HMFD by Wavelet packet decomposition (WPD) can effectively detect low-embedding-rate Least significant bit (LSB) speech steganography; its accuracy reaches 60.8% even when the embedding rate is only 3%.
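Generic histogram central moments, which underlie HMFD-style features, can be sketched as follows (the wavelet-packet subband computation on which the paper applies them is assumed and omitted):

```python
# Central moments of a histogram, treated as a discrete distribution over
# bin indices: normalize counts to probabilities, then compute the k-th
# central moment for each order from 2 up to max_order.
def histogram_moments(hist, max_order):
    total = float(sum(hist))
    p = [h / total for h in hist]
    mean = sum(i * pi for i, pi in enumerate(p))
    return [sum((i - mean) ** k * pi for i, pi in enumerate(p))
            for k in range(2, max_order + 1)]

# A symmetric histogram has zero odd central moments.
moments = histogram_moments([1, 4, 6, 4, 1], 4)
```

In a steganalysis pipeline such moments, computed per subband, would form the feature vector fed to a classifier.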
Abstract: With the rapid growth of inquiry in biomedicine concerning diseases, the recognition of diseases becomes especially important. But recognition of biomedical concepts in the literature alone is not enough; annotation and normalization of the concepts against a normalized Metathesaurus become even more important. This paper proposes a system to annotate the literature with a normalized Metathesaurus. First, a two-phase Conditional random fields (CRFs) model is used to recognize disease mentions, including their location and identification. Then, the paper adopts the Disease ontology (DO) to annotate the recognized diseases for normalization by computing the similarity between disease mentions and concepts. According to the similarities, the disease mentions are distinguished as disease concepts and instances. The experiments carried out on the Arizona disease corpus show that our system performs well and outperforms other works.
Abstract: The objective of acoustic crosstalk cancellation is to use loudspeakers to deliver prescribed binaural signals (that reproduce a particular auditory scene) to a listener's ears, which is useful for 3-D audio applications. In practice, the actual transfer function matrix will differ from the design matrix, because of the listener's head movement or rotation, among other causes. A Crosstalk cancellation system (CCS) is very sensitive to these perturbations. Generally, to improve the robustness of a CCS, several pairs of loudspeakers are needed whose positions vary continuously with frequency. Using stochastic analysis, we propose a stochastic robust approximation crosstalk cancellation method based on a random perturbation matrix that models the variations of the transfer function matrix. Under the free-field condition, simulation results demonstrate the effectiveness of the proposed method.
Abstract: This paper studies the properties of orbit matrices and gives a formula to compute the number of these orbit matrices on 4p variables, where p is an odd prime. It is demonstrated that the construction of 1-resilient Rotation symmetric Boolean functions (RSBFs) on 4p variables is equivalent to solving an equation system. By the proposed method, all 1-resilient RSBFs on 12 variables can be constructed. We present a counting formula for the total number of 1-resilient RSBFs on 4p variables. As an application of our method, some 1-resilient RSBFs on 12 variables are presented.
Abstract: A DNA algorithm operating on plasmids is presented to solve a special integer programming problem, a typical hard computing problem. The DNA algorithm employs double-stranded molecules to encode the variables of a 0-1 programming problem; the encoded DNA molecules are inserted into circular plasmids as foreign DNA. Subsequently, a series of enzymatic treatments is performed on the plasmids in order to find feasible solutions to the given problem. The final optimum is obtained by applying the feasible solutions found to the objective function. Compared with other DNA algorithms for integer programming problems, the proposed algorithm is simple, error-resistant, and, above all, feasible. Our work clearly shows the distinct advantages of the plasmid DNA computing model in solving integer programming problems.
Abstract: Inspired by the way smart pirates deal with their treasures, this paper proposes a scheme to protect data security and privacy in cloud computing. In the proposed scheme, cloud data are divided into sequenced or logical blocks, which are distributed among cloud storage service providers. Instead of protecting the data themselves, the proposed scheme protects the mapping of the data elements on each provider. Comprehensive analysis and simulation are designed to verify the proposed scheme; the results show that it is secure for cloud data, and a real implementation on Amazon S3 also indicates that the proposed scheme is feasible and efficient for cloud data storage.
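A toy sketch of the mapping-protection idea (all details invented for illustration): blocks are scattered across providers in a secret random order, so each provider stores fragments without knowing their logical positions, and only the owner-held mapping permits reconstruction:

```python
import random

# Scatter blocks across providers; the mapping (block index -> provider,
# slot) is the secret the owner keeps, analogous to a pirate's map.
def distribute(blocks, n_providers, rng):
    stores = [[] for _ in range(n_providers)]
    mapping = []  # secret: (block index, provider, slot at that provider)
    order = list(range(len(blocks)))
    rng.shuffle(order)  # hide the logical ordering from providers
    for i in order:
        p = rng.randrange(n_providers)
        mapping.append((i, p, len(stores[p])))
        stores[p].append(blocks[i])
    return stores, mapping

def reconstruct(stores, mapping, n_blocks):
    out = [None] * n_blocks
    for i, p, slot in mapping:
        out[i] = stores[p][slot]
    return out

blocks = ["b0", "b1", "b2", "b3", "b4"]
stores, mapping = distribute(blocks, 3, random.Random(7))
```

Without the mapping, a single provider sees only an unordered subset of fragments, which is the property the scheme relies on instead of encrypting the data themselves.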
Abstract: Device-to-Device (D2D) communications have drawn considerable attention with the obvious advantages of a higher data rate and spectrum efficiency. However, they also bring intra-cell interference due to resource sharing with traditional Cellular users (CUs). An effective resource allocation scheme for D2D communications to maximize the system throughput is developed. This scheme first utilizes a guard area model to restrict the interference between D2D users (DUs) and CUs. Then, a max-flow algorithm is used to match pairs of CUs and DUs and maximize the total sum rate of the communication system. Numerical results demonstrate that the proposed scheme can yield significant throughput gain while maintaining quality of service for both CUs and DUs.
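The CU-DU pairing step can be viewed as maximum bipartite matching, a unit-capacity special case of max-flow. A minimal sketch using Kuhn's augmenting-path algorithm follows; the admissible-CU lists per DU are assumed inputs (e.g. the pairs surviving the guard-area filtering):

```python
# Kuhn's augmenting-path algorithm for maximum bipartite matching:
# adj[du] lists the CUs that DU may share resources with.
def max_matching(adj, n_cu):
    match_cu = [-1] * n_cu  # match_cu[c] = DU currently paired with CU c

    def augment(du, seen):
        for cu in adj[du]:
            if cu not in seen:
                seen.add(cu)
                # CU is free, or its current DU can be re-routed elsewhere.
                if match_cu[cu] == -1 or augment(match_cu[cu], seen):
                    match_cu[cu] = du
                    return True
        return False

    return sum(augment(du, set()) for du in range(len(adj)))

# 3 DUs, 3 CUs: DU0 -> {CU0}, DU1 -> {CU0, CU1}, DU2 -> {CU1, CU2}
print(max_matching([[0], [0, 1], [1, 2]], 3))  # 3 pairs matched
```

A rate-maximizing variant would weight the edges by achievable sum rate and solve the corresponding max-weight matching or max-flow problem instead.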
Abstract: Source localization using sensor networks usually requires many sensors to localize a small number of sources, and it is still very troublesome to deal with coherent sources. When three-dimensional (3-D) space is considered, localization becomes even more difficult. A new approach is proposed to localize 3-D wideband coherent sources based on a distributed sensor network, which consists of two nodes, each containing only two sensors. Direction-of-arrival (DOA) estimation is performed at each node by employing a newly proposed noise subspace. Combining the pattern matching idea and the prior geometrical information of the sources, a cost function is constructed to estimate the rough positions. A rotational projection algorithm is proposed to estimate the heights of the sources and correct the rough positions, and consequently the localization of 3-D sources can be achieved. Numerical examples are provided to demonstrate the effectiveness of this approach.
Abstract: Detecting forthcoming events in a timely manner is a primary way to minimize their damage in distributed sensor systems. We propose a novel event detection framework for sensor networks with multiple micro-environments using fuzzy sets. Fuzzy-set-assisted event information description and fusion approaches are proposed, and two distributed Node-level forthcoming event detection (NFED) algorithms are devised. Experimental results on both real-life and synthetic data sets demonstrate that our NFED-by-leveraging Spatial correlations (NFED-bySC) algorithm requires only a small amount of data transmission with an accuracy guarantee.
Abstract: To resolve range migration and Doppler frequency spread, a new parameter estimation method, namely the Dual-carrier frequency-Radon fractional Fourier transform (DCF-RFRFT), is proposed based on dual-carrier frequency radar data. In this method, dual-carrier frequency radar data are constructed in the range frequency domain, and a new signal is constructed by multiplying one with the complex conjugate of the other. The Radon fractional Fourier transform (RFRFT) is then performed to obtain estimates of velocity and acceleration. This method does not need any prior knowledge of the targets and can resolve the Doppler ambiguity problem. The simulation results demonstrate the effectiveness of the proposed algorithm.
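The conjugate-multiplication step can be illustrated in a simplified single-chirp setting (an assumed analogy, not the full DCF-RFRFT): multiplying a chirp by a delayed, conjugated copy of itself cancels the quadratic phase, leaving a phase that is linear in time, i.e. a single tone whose frequency is proportional to the chirp rate:

```python
import cmath

# Chirp s(t) = exp(j*pi*k*t^2); the product s(t) * conj(s(t - tau)) has
# phase pi*k*(2*tau*t - tau^2), linear in t, so its instantaneous
# frequency is the constant k*tau.
k, tau, dt = 50.0, 0.1, 1e-3
s = lambda t: cmath.exp(1j * cmath.pi * k * t * t)
t_axis = [n * dt for n in range(1000)]
y = [s(t) * s(t - tau).conjugate() for t in t_axis]
# Consecutive phase increments of y are constant: 2*pi*k*tau*dt.
dphi = [cmath.phase(y[n + 1] / y[n]) for n in range(len(y) - 1)]
```

Reducing the phase order this way is what lets a subsequent transform concentrate the energy and read off motion parameters such as velocity and acceleration.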
Abstract: An efficient parameter extraction method is essential for establishing a large-signal statistical model. This paper presents an automatic parameter extraction method for the I-V model of a Gallium nitride (GaN) High electron mobility transistor (HEMT) large-signal statistical model. To accurately model the statistical characteristics, all 53 parameters in the I-V model are considered. To realize automatic parameter extraction, the model parameters are divided into blocks according to their physical meaning, reducing the complexity of the I-V model. Different parameter blocks are extracted separately by fitting the pulsed I-V transfer characteristic curves of the device at different quiescent bias points. A large-signal statistical model for a 0.25μm GaN HEMT process has been established using the proposed method after measuring 34 GaN HEMTs from 10 batches. The results show that the large-signal performances (output power and power-added efficiency) can be reproduced with high accuracy by the proposed statistical model.
Abstract: To reduce computational cost, a Fast position and velocity joint determination (FPVD) method based on Near-maximum likelihood (NML) and the Least square method (LSM) is proposed for X-ray pulsar navigation. Considering that the Doppler effects caused by the velocity of the spacecraft alter the X-ray pulse Time-of-arrival (TOA), we adopt the NML to obtain multiple TOAs, and then utilize the LSM to estimate the variation of the TOAs instead of the pulsar profile distortion adopted by the Maximum-likelihood (ML) estimation method. Finally, according to the variation value and the TOAs, the position and velocity information of the spacecraft are calculated. The simulation results demonstrate that the FPVD method is far faster than the ML estimation method, and its accuracy approaches the Cramer-Rao lower bound (CRLB).
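The LSM step amounts to fitting a line to the TOA sequence: with an (assumed) constant radial velocity, the TOAs drift linearly, and the fitted slope gives the velocity-induced variation without the profile-distortion search of the full ML estimator. A minimal sketch with invented, noiseless values:

```python
# Ordinary least-squares line fit: slope recovers the (assumed constant)
# TOA drift rate, intercept recovers the initial TOA.
def lsq_line(t, y):
    n = len(t)
    st, sy = sum(t), sum(y)
    stt = sum(x * x for x in t)
    sty = sum(a * b for a, b in zip(t, y))
    slope = (n * sty - st * sy) / (n * stt - st * st)
    return slope, (sy - slope * st) / n

drift = 3.3e-6                 # illustrative: extra delay per second
epochs = [10.0 * i for i in range(8)]
toas = [0.002 + drift * t for t in epochs]  # noiseless TOA measurements
slope, toa0 = lsq_line(epochs, toas)
# slope recovers the drift, toa0 the initial TOA
```

With noisy TOAs the same fit averages the measurement errors, which is what makes the least-squares step both fast and accurate relative to the ML search.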