By solving the existing expectation-signal-to-noise ratio (expectation-SNR) based inequality model of the closed-form instantaneous cross-correlation function type of Choi-Williams distribution (CICFCWD), the linear canonical transform (LCT) free parameters selection strategies obtained are usually unsatisfactory. Since the second-order moment variance outperforms the first-order moment expectation in accurately characterizing output SNRs, this paper uses the variance analysis technique to improve parameters selection strategies. The CICFCWD’s average variance of deterministic signals embedded in additive zero-mean stationary circular Gaussian noise processes is first obtained. Then the so-called variance-SNRs are defined and applied to model a variance-SNR based inequality. A stronger inequalities system is also formulated by integrating expectation-SNR and variance-SNR based inequality models. Finally, a direct application of the system in noisy one-component and bi-component linear frequency-modulated signals detection is studied. Analytical algebraic constraints on LCT free parameters newly derived seem more accurate than the existing ones, achieving better noise suppression effects. Our methods have potential applications in optical, radar, communication and medical signal processing.
Protein localization information is essential for understanding protein functions and their roles in various biological processes. The image-based prediction methods of protein subcellular localization have emerged in recent years because of the advantages of microscopic images in revealing spatial expression and distribution of proteins in cells. However, the image-based prediction is a very challenging task, due to the multi-instance nature of the task and low quality of images. In this paper, we propose a multi-task learning strategy and mask generation to enhance the prediction performance. Furthermore, we also investigate effective multi-instance learning schemes. We collect a large-scale dataset from the Human Protein Atlas database, and the experimental results show that the proposed multi-task multi-instance learning model outperforms both single-instance learning and common multi-instance learning methods by large margins.
Satellite-based positioning has been widely applied in many areas of daily life and has become indispensable, which has also led to increasing demand for high positioning accuracy. In some complex environments (such as dense urban areas and valleys), multipath interference is one of the main error sources degrading positioning accuracy, and it is difficult to eliminate via differential techniques because its occurrence is uncertain and it is uncorrelated across different instants. To address this problem, we formulate positioning for global navigation satellite systems (GNSS) as an optimization problem and solve it with a modified teaching-learning based optimization (TLBO) algorithm. Experiments are conducted using actual satellite data. The results show that the proposed positioning algorithm outperforms other algorithms, such as the particle swarm optimization based positioning algorithm, the differential evolution based positioning algorithm, the variable projection method, and the TLBO algorithm, in terms of accuracy and stability.
Triangular geometry is the basis of accurate near-field array distance estimation algorithms. The Fisher expression of traditional distance estimation is derived by utilizing the Taylor series. To improve the convergence rate and estimation accuracy, a novel iterative distance estimation algorithm with differential equations is proposed based on the triangular geometry. Firstly, its convergence performance is analysed in detail. Secondly, the selection of the initial value and the number of iterations are studied. Thirdly, compared with traditional estimation algorithms that utilize the Fisher approximation, the proposed algorithm has a higher convergence rate and estimation accuracy. Moreover, its pseudocode is presented. Finally, experimental results and a performance analysis are provided to verify the effectiveness of the proposed algorithm.
A common but critical task in the analysis of biological ontology data is comparing ontologies. Numerous ontology-based semantic-similarity measures have been proposed for specific ontology domains, but cross-domain ontology comparison remains a challenge. An ontology contains scientific natural language descriptions of the corresponding biological aspect. Therefore, we develop a new method based on the natural language processing (NLP) representation model bidirectional encoder representations from transformers (BERT) for cross-domain semantic representation of biological ontologies. This article uses the BERT model to represent the ontologies at the word level as a set of vectors, facilitating semantic analysis and comparison of the biomedical entities named in an ontology or associated with ontology terms. We evaluated our method in two experiments: calculating pairwise similarities between disease ontology and human phenotype ontology terms, and predicting pairwise protein interactions. The experimental results demonstrated comparative performance. This shows promise for the development of NLP methods in biological data analysis.
Recently, many deep learning models have shown excellent performance in hyperspectral image (HSI) classification. Among them, networks with multiple convolution kernels of different sizes have been proved to achieve richer receptive fields and extract more representative features than those with a single convolution kernel. However, in most networks, different-sized convolution kernels are usually used directly on multi-branch structures, and the image features extracted from them are fused directly and simply. In this paper, to fully and adaptively explore the multiscale information in both spectral and spatial domains of HSI, a novel multi-scale weighted kernel network (MSWKNet) based on an adaptive receptive field is proposed. First, the original HSI cubic patches are transformed to the input features by combining the principal component analysis and one-dimensional spectral convolution. Then, a three-branch network with different convolution kernels is designed to convolve the input features, and adaptively adjust the size of the receptive field through the attention mechanism of each branch. Finally, the features extracted from each branch are fused together for the task of classification. Experiments on three well-known hyperspectral data sets show that MSWKNet outperforms many deep learning networks in HSI classification.
To reduce the overhead and complexity of channel state information acquisition in interference alignment, topological interference management (TIM), which relies only on network topology information, was proposed to manage interference. The low-rank matrix completion approach to TIM studied in previous research is known to be NP-hard. This paper considers a clustering method for the TIM problem, in which low-rank matrix completion for TIM is applied within each cluster. Based on the clustering result, we solve the low-rank matrix completion problem via nuclear norm minimization and a Frobenius norm minimization function. Simulation results demonstrate that the proposed clustering method combined with TIM leads to a significant gain in the achievable degrees of freedom.
Carotid artery stenosis is a serious medical condition that can lead to stroke. It can be diagnosed from transcranial Doppler data by using machine learning methods to construct a classifier model. We propose an improved fuzzy support vector machine (FSVMI) model to predict carotid artery stenosis, with the maximum geometric mean (Gmean) as the optimization target. The fuzzy membership function is obtained by combining information entropy with the normalized class-center distance. Experimental results show that the proposed model is superior to the benchmark models on the sensitivity and geometric mean criteria.
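The Gmean target named in the carotid stenosis abstract can be illustrated with a minimal sketch. It assumes the usual definition for imbalanced binary classification, the square root of sensitivity times specificity, which the abstract does not spell out; the function name `gmean` and the label convention (1 = stenosis) are hypothetical.

```python
import math

def gmean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against a class being absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

if __name__ == "__main__":
    # Hypothetical labels: 1 = stenosis, 0 = healthy.
    print(gmean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
```

Unlike plain accuracy, this score drops to zero when a classifier ignores the minority class entirely, which is presumably why it suits an imbalanced diagnostic task.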
Optimal trajectory planning is a fundamental problem in robotics research. For the time-optimal trajectory planning problem of a robotic arm in motion, this paper proposes a method based on a segmented polynomial interpolation function with a locally chaotic particle swarm optimization (LCPSO) algorithm. While completing convergence in the early or middle part of the search, the algorithm improves on the local convergence problem of the traditional particle swarm optimization (PSO) and improved learning factor PSO (IFPSO) algorithms. Finally, simulation experiments are carried out in joint space to obtain the optimal time and a smooth motion trajectory for each joint, which show that the method can effectively shorten the running time of the robotic manipulator while ensuring the stability of the motion.
In rail transit systems, improving transportation efficiency has become a research hotspot. In recent years, train control systems based on virtual coupling have attracted the attention of many scholars. The train operation control method is the key both to realizing a virtual coupling train operation control system and to preventing accidents. Therefore, based on existing research, a virtual coupled train dynamics model with nonlinear dynamics is established. Then, the recursive least squares method based on train running process data is used to identify the model parameters of the nonlinear dynamics virtual coupling train coupling process, and it is applied to the variable parameter artificial potential field (VAPF) to identify the parameters. A fusion controller based on feature-based generalized model prediction (GPC) and VAPF is used to control the virtual coupled train and prevent collision. Finally, a section of the Beijing-Shanghai high-speed railway is taken as the background to verify the effectiveness of the proposed method.
Adopting software-definition technology to decouple the functional components of the industrial control system (ICS) in a service-oriented and distributed form is an important way for the industrial Internet of things to integrate information technology, communication technology, and operation technology. Therefore, this paper presents the concept of software-defined control architecture and describes the time consistency requirements under the paradigm shift of ICS architecture. By analyzing the physical clock and virtual clock mechanism models, the global clock synchronization space is logically divided into the physical and virtual clock synchronization domains, and a formal description of the global clock synchronization space is proposed. According to the fundamental analysis of the clock state model, the physical clock linear filtering synchronization model is derived, and a distributed observation fusion filtering model is constructed by considering the two observation modes of the virtual clock to realize the time synchronization of the global clock space by way of timestamp layer-by-layer transfer and fusion estimation. Finally, the simulation results show that the proposed model can significantly improve the accuracy and stability of clock synchronization.
With the recent increase in the number of Internet of Things (IoT) services, an intelligent scheduling strategy is needed to manage these services. In this paper, the problem of automatic choreography of microservices in IoT is explored. A type of reinforcement learning (RL) algorithm called TD3 is used to generate the optimal choreography policy under the framework of a software-defined network. The optimal policy is gradually reached during the learning procedure to achieve the goal, despite the dynamic characteristics of the network environment. The simulation results show that compared with other methods, the TD3 algorithm converges faster after a certain number of iterations, and it performs better than other non-RL algorithms by obtaining the highest reward. The TD3 algorithm can efficiently adjust the traffic transmission path and provide qualified IoT services.
Multi-task learning is an essential yet practical mechanism for improving overall performance in various machine learning fields. Owing to the linguistic hierarchy, the hierarchical joint model is a common architecture used in natural language processing. However, in the state-of-the-art hierarchical joint models, higher-level tasks only share bottom layers or latent representations with lower-level tasks thus ignoring correlations between tasks at different levels, i.e., lower-level tasks cannot be instructed by the higher features. This paper investigates how to advance the correlations among various tasks supervised at different layers in an end-to-end hierarchical joint learning model. We propose a semi-shared hierarchical model that contains cross-layer shared modules and layer-specific modules. To fully leverage the mutual information between various tasks at different levels, we design four different dataflows of latent representations between the shared and layer-specific modules. Extensive experiments on CTB-7 & CONLL-09 show that our semi-shared approach outperforms basic hierarchical joint models on sequence tagging while having much fewer parameters. It inspires us that the proper implementation of the cross-layer sharing mechanism and residual shortcuts is promising to improve the performance of hierarchical joint NLP models while reducing the model complexity.
A family of binary sequences derived from Euler quotients $\psi(\cdot)$ with RSA modulus $pq$ is introduced. Here the two primes $p$ and $q$ are distinct and satisfy $\gcd(pq, (p-1)(q-1))=1$. The linear complexities and minimal polynomials of the proposed sequences are determined. Moreover, sequences of this kind are shown not to have correlation of order four, although the relation $\psi(t)-\psi(t+p^2q)-\psi(t+q^2p)+\psi(t+(p+q)pq)=0 \pmod {pq}$ holds for any integer $t$ by the properties of Euler quotients.
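The four-term relation on Euler quotients stated in the abstract above can be checked numerically with a short sketch. It assumes the standard definition $\psi(t) \equiv (t^{\varphi(pq)}-1)/pq \pmod{pq}$ for $\gcd(t,pq)=1$ and $\psi(t)=0$ otherwise, which the abstract does not state explicitly; the function names are hypothetical, and the binary sequences themselves are not constructed here.

```python
from math import gcd

def euler_quotient(t, p, q):
    """Euler quotient modulo n = p*q (assumed standard definition)."""
    n = p * q
    if gcd(t, n) != 1:
        return 0
    phi = (p - 1) * (q - 1)
    # t**phi is congruent to 1 mod n, so (t**phi mod n**2) - 1 is divisible by n.
    return ((pow(t, phi, n * n) - 1) // n) % n

def relation_holds(t, p, q):
    """Check psi(t) - psi(t+p^2 q) - psi(t+q^2 p) + psi(t+(p+q)pq) = 0 mod pq."""
    n = p * q
    total = (euler_quotient(t, p, q)
             - euler_quotient(t + p * p * q, p, q)
             - euler_quotient(t + q * q * p, p, q)
             + euler_quotient(t + (p + q) * n, p, q))
    return total % n == 0

if __name__ == "__main__":
    p, q = 3, 5  # toy primes with gcd(pq, (p-1)(q-1)) = gcd(15, 8) = 1
    print(all(relation_holds(t, p, q) for t in range((p * q) ** 2)))
```

The check loops over one full period $(pq)^2$ of the quotient; real RSA moduli are far too large for exhaustive checking, so this is only a sanity test on toy parameters.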
Quantum algorithms are raising concerns in the field of cryptography all over the world. A growing number of symmetric cryptography algorithms have been attacked in the quantum setting. Type-3 generalized Feistel scheme (GFS) and unbalanced Feistel scheme with expanding functions (UFS-E) are common symmetric cryptography schemes, which are often used in cryptographic analysis and design. We propose quantum attacks on the two Feistel schemes. For $d$-branch Type-3 GFS and UFS-E, we propose distinguishing attacks on $(d+1)$-round Type-3 GFS and UFS-E in polynomial time in the quantum chosen plaintext attack (qCPA) setting. We propose key recovery by applying Grover's algorithm and Simon's algorithm. For $r$-round $d$-branch Type-3 GFS with $k$-bit length subkey, the complexity is $O({2^{(d - 1)(r - d - 1)k/2}})$ for $r\ge d + 2$. The result is better than that based on exhaustive search by a factor ${2^{({d^2} - 1)k/2}}$. For $r$-round $d$-branch UFS-E, the attack complexity is $O({2^{(r - d - 1)(r - d)k/4}})$ for $d + 2 \le r \le 2d$, and $O({2^{(d - 1)(2r - 3d)k/4}})$ for $r > 2d$. The results are better than those based on exhaustive search by factors ${2^{(4rd - {d^2} - d - {r^2} - r)k/4}}$ and ${2^{3(d - 1)dk/4}}$ in the quantum setting, respectively.
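The improvement factors quoted in the abstract above can be cross-checked with a short calculation. It assumes, which the abstract does not state, that the exhaustive-search baseline is a Grover search over all $r(d-1)k$ subkey bits, costing $O(2^{r(d-1)k/2})$; each factor's exponent is then the baseline exponent minus the attack exponent. For Type-3 GFS with $r \ge d+2$:

$$\frac{r(d-1)k}{2} - \frac{(d-1)(r-d-1)k}{2} = \frac{(d-1)(d+1)k}{2} = \frac{(d^2-1)k}{2}.$$

For UFS-E with $d+2 \le r \le 2d$:

$$\frac{2r(d-1)k}{4} - \frac{(r-d-1)(r-d)k}{4} = \frac{(4rd - d^2 - d - r^2 - r)k}{4}.$$

For UFS-E with $r > 2d$:

$$\frac{2r(d-1)k}{4} - \frac{(d-1)(2r-3d)k}{4} = \frac{3(d-1)dk}{4}.$$

Under this assumed baseline, all three exponents agree with the factors stated in the abstract.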
Color quick response (QR) codes are an important direction for the future development of QR codes and, with the wide application of QR code technology, have become a research hotspot because of the additional functional characteristics of their colors. Existing color QR codes have solved the problem of information storage capacity, but they require an enormous hardware and software support system, which makes direct readability an urgent issue. This paper proposes a novel color QR code that combines multiple types of identification information. The code combines multiplexing and color-coding technology to present publicly encoded information (such as advertisements and public query information) as plain code, while traceability, blockchain, anti-counterfeiting authentication and other information is concealed as hidden code. We elaborate the basic principle of this code, construct its mathematical model and supply a set of algorithm design processes, which break through the key technology of halftone printout. The experimental results show that the proposed color QR code realizes multi-code integration and can be read directly without special scanning equipment, which gives it unique advantages in the field of printed anti-counterfeiting labels.
A synchronous GNSS generator spoofer aims at directly taking over the tracking loops of the receiver with the lowest possible spoofing to signal ratio (SSR) without forcing it to lose lock. This paper investigates the factors that affect spoofing success and their relationships. The necessary conditions for successful spoofing are obtained by deriving the code tracking error in the presence of spoofing and analyzing the effects of SSR, spoofing synchronization errors, and receiver settings on the S-curve ambiguity and code tracking trajectory. The minimum SSRs for a successful spoofing calculated from the theoretical formulation agree with Monte Carlo simulations at digital intermediate frequency signal level within 1 dB when the spoofer pulls the code phase in the same direction as the code phase synchronization error, and the required SSRs can be much lower when pulling in the opposite direction. The maximum spoofing code phase error for a successful spoofing is tested by using TEXBAT datasets, which coincides with the theoretical results within 0.1 chip. This study reveals the mechanism of covert spoofing and can play a constructive role in the future development of spoofing and anti-spoofing methods.
Synthetic aperture radar (SAR) imaging is an efficient strategy that exploits the properties of microwaves to capture images. A major concern in SAR imaging is the reconstruction of the image from backscattered signals in the presence of noise. The reflected signal contains more noise than target signal, and reducing the noise in the collected signal for better image reconstruction is a challenging problem. Current studies mostly focus on filtering techniques for noise removal. This can result in an undesirable point spread function (PSF), causing an extreme smearing effect in the desired image. To handle this problem, a computational technique, particle swarm optimization (PSO), is used for de-noising, and the target performance is then further improved by combining it with a Wiener filter. Moreover, to improve the de-noising performance, we exploit singular value decomposition based morphological filtering. To justify the proposed improvements, we simulate the proposed techniques and compare the results with conventional existing models. The proposed method shows a considerable decrease in mean square error (MSE) compared with the Wiener filter and PSO techniques. A quantitative analysis of image restoration quality is also presented in comparison with the Wiener filter and PSO, based on the improvement in signal-to-noise ratio (ISNR) and peak signal-to-noise ratio (PSNR).
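The three quality measures named in the SAR abstract above (MSE, PSNR and ISNR) can be sketched as follows. The formulas are the commonly used definitions, assumed here because the abstract does not give them; images are represented as flat lists of pixel values, and the `peak` default of 255 assumes 8-bit data.

```python
import math

def mse(ref, img):
    """Mean square error between a reference image and a test image."""
    return sum((a - b) ** 2 for a, b in zip(ref, img)) / len(ref)

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    err = mse(ref, img)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

def isnr(ref, noisy, restored):
    """Improvement in SNR (dB) of the restored image over the noisy one."""
    err_restored = mse(ref, restored)
    if err_restored == 0:
        return math.inf
    return 10.0 * math.log10(mse(ref, noisy) / err_restored)

if __name__ == "__main__":
    # Hypothetical 4-pixel example.
    ref = [100.0, 100.0, 100.0, 100.0]
    noisy = [110.0, 90.0, 110.0, 90.0]
    restored = [101.0, 99.0, 101.0, 99.0]
    print(mse(ref, noisy), psnr(ref, restored), isnr(ref, noisy, restored))
```

A positive ISNR means the restoration step moved the image closer to the reference than the noisy input was; a negative value means de-noising made things worse.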
In this paper, we derive and propose a track-oriented marginal Poisson multi-Bernoulli mixture (TO-MPMBM) filter to address the problem that the standard random finite set (RFS) filters cannot build continuous trajectories for multiple extended targets. Firstly, the Poisson point process (PPP) model and the multi-Bernoulli mixture (MBM) model are used to establish the set of birth trajectories and the set of existing trajectories, respectively. Secondly, the proposed filter recursively propagates the marginal association distributions and the Poisson multi-Bernoulli mixture (PMBM) density over the set of alive trajectories. Finally, after pruning and merging process, the trajectories with existence probability greater than the given threshold are extracted as the estimated target trajectories. A comparison of the proposed filter with the existing trajectory filters in two classical scenarios confirms the validity and reliability of the TO-MPMBM filter.
The robustness of adversarial examples to image scaling transformation is usually ignored when most existing adversarial attacks are proposed. In contrast, image scaling is often the first step of the model to transfer various sizes of input images into fixed ones. We evaluate the impact of image scaling on the robustness of adversarial examples applied to image classification tasks. We set up an image scaling system to provide a basis for robustness evaluation and conduct experiments in different situations to explore the relationship between image scaling and the robustness of adversarial examples. Experiment results show that various scaling algorithms have a similar impact on the robustness of adversarial examples, but the scaling ratio significantly impacts it.
In this paper, we propose a low-complexity distributed approach to address the multitarget detection/tracking problem in the presence of noisy and missing data. The proposed approach consists of two components: a distributed flooding scheme for exchanging measurements among sensors and a sampling-based clustering approach for target detection/tracking from the aggregated measurements. The main advantage of the proposed approach over the prevailing Markov-Bayes-based distributed filters is that it does not require any a priori information; the only information required is the measurement set from multiple sensors. A comparison of the proposed approach with the available distributed clustering approaches and with cutting-edge distributed multi-Bernoulli filters modeled with appropriate parameters confirms the effectiveness and reliability of the proposed approach.
Weighted sampling methods based on k-nearest neighbors have been demonstrated to be effective in solving the class imbalance problem. However, they usually ignore the positional relationship between a sample and the heterogeneous samples in its neighborhood when calculating the sample weight. This paper proposes a novel neighborhood-weighted based (NWBBagging) sampling method to improve the Bagging algorithm’s performance on imbalanced datasets. It considers the positional relationship between the center sample and the heterogeneous samples in its neighborhood when identifying critical samples. A parameter reduction method is also proposed and combined into the ensemble learning framework, which reduces the number of parameters and increases the classifier’s diversity. We compare NWBBagging with some state-of-the-art ensemble learning algorithms on 34 imbalanced datasets, and the results show that NWBBagging achieves better performance.
Samples collected from most industrial processes pose two challenges: they are contaminated by non-Gaussian noise, and they gradually become obsolete. These characteristics can markedly reduce the accuracy and generalization of models. To handle these challenges, a novel method, named the robust online extreme learning machine (RO-ELM), is proposed in this paper, in which the least mean $p$-power criterion is employed as the cost function to boost the robustness of the ELM, and a forgetting mechanism is introduced to discard obsolete samples. To investigate the performance of the RO-ELM, experiments are performed on artificial and real-world regression and classification datasets with non-Gaussian noise. Results show that the RO-ELM is more robust than the ELM, the online sequential ELM (OS-ELM) and the OS-ELM with forgetting mechanism (FOS-ELM). The accuracy and generalization of the RO-ELM models are better than those of other models for online learning.
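The robustness argument behind the least mean $p$-power criterion in the RO-ELM abstract can be illustrated with a small sketch. It assumes the common form of the cost, the mean of $|e_i|^p$ over the errors, and does not reproduce the RO-ELM update or its forgetting mechanism; `largest_error_share` is a hypothetical helper added only to show how much one outlier dominates the cost.

```python
def lmp_cost(errors, p=1.5):
    """Least mean p-power cost: mean of |e|**p (p = 2 gives the usual MSE)."""
    return sum(abs(e) ** p for e in errors) / len(errors)

def largest_error_share(errors, p):
    """Fraction of the total cost contributed by the single largest error."""
    powered = [abs(e) ** p for e in errors]
    return max(powered) / sum(powered)

if __name__ == "__main__":
    # Hypothetical residuals with one heavy-tailed outlier.
    errors = [0.5, -1.0, 0.8, 10.0]
    for p in (1.0, 1.5, 2.0):
        print(p, lmp_cost(errors, p), largest_error_share(errors, p))
```

With $p < 2$ the outlier accounts for a smaller fraction of the total cost than under the squared error, which is the intuition for using this criterion with non-Gaussian noise.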
To solve the problem of semantic loss in text representation, this paper proposes a new embedding method of word representation in semantic space called wt2svec based on supervised latent Dirichlet allocation (SLDA) and Word2vec. It generates the global topic embedding word vector utilizing SLDA which can discover the global semantic information through the latent topics on the whole document set. It gets the local semantic embedding word vector based on the Word2vec. The new semantic word vector is obtained by combining the global semantic information with the local semantic information. Additionally, the document semantic vector named doc2svec is generated. The experimental results on different datasets show that wt2svec model can obviously promote the accuracy of the semantic similarity of words, and improve the performance of text categorization compared with Word2vec.
Existing neural approaches have achieved significant progress in Chinese word segmentation (CWS). However, the performance of these methods tends to drop dramatically in cross-domain scenarios because of the data distribution mismatch across domains and the out-of-vocabulary word problem. To address these two issues, this paper proposes a lexicon-augmented graph convolutional network for cross-domain CWS. The novel model can capture word boundary information from all candidate words and utilize domain lexicons to alleviate the distribution gap across domains. Experimental results on the cross-domain CWS datasets (SIGHAN-2010 and TCM) show that the proposed method successfully models the information in domain lexicons for neural CWS approaches and helps to achieve competitive performance for cross-domain CWS. The two problems of cross-domain CWS can be effectively addressed through various graph-based interactions between characters and candidate words. Further, experiments on the CWS benchmarks (Bakeoff-2005) also demonstrate the robustness and efficiency of the proposed method.
Due to insufficient data and the high cost of data annotation, it is usually necessary to use knowledge transfer to recognize speech emotion. However, the uncertainty and subjectivity of emotion make speech emotion recognition based on transfer learning more challenging. Domain adaptation based on maximum mean discrepancy considers the marginal alignment of the source domain and target domain but disregards the class prior distribution in both domains, which reduces the transfer efficiency. To solve this problem, a novel cross-corpus speech emotion recognition framework based on local domain adaptation is proposed, in which a local weighted maximum mean discrepancy is used to evaluate the distance between different emotion datasets. Experimental results show that cross-corpus speech emotion recognition is improved compared with other cross-corpus methods, including global domain adaptation and direct cross-corpus speech emotion recognition.
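The maximum mean discrepancy used in the speech emotion abstract above can be sketched in a few lines. This is a generic weighted (biased) estimate of the squared MMD with a Gaussian kernel, not the paper's local weighted formulation: the per-sample weights `wx` and `wy`, the kernel choice and `gamma` are all assumptions made for illustration. With uniform weights it reduces to the ordinary global MMD.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def _weighted_kernel_sum(xs, wx, ys, wy, gamma):
    return sum(a * b * rbf_kernel(x, y, gamma)
               for a, x in zip(wx, xs)
               for b, y in zip(wy, ys))

def weighted_mmd2(xs, ys, wx=None, wy=None, gamma=1.0):
    """Squared MMD between weighted samples xs (source) and ys (target).

    Weights default to uniform; each weight list is expected to sum to 1.
    """
    wx = wx if wx is not None else [1.0 / len(xs)] * len(xs)
    wy = wy if wy is not None else [1.0 / len(ys)] * len(ys)
    return (_weighted_kernel_sum(xs, wx, xs, wx, gamma)
            + _weighted_kernel_sum(ys, wy, ys, wy, gamma)
            - 2.0 * _weighted_kernel_sum(xs, wx, ys, wy, gamma))

if __name__ == "__main__":
    # Hypothetical 1-D features from two corpora.
    source = [[0.0], [1.0]]
    target = [[5.0], [6.0]]
    print(weighted_mmd2(source, target))
```

In a class-aware variant, one would presumably compute such a weighted term per emotion class, with weights reflecting how strongly each sample belongs to that class, and combine the per-class terms.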
Deep-learning-based language models have improved generation-based linguistic steganography, posing a huge challenge for linguistic steganalysis. The existing neural-network-based linguistic steganalysis methods are incompetent to deal with complicated text because they only extract single-granularity features such as global or local text features. To fuse multi-granularity text features, we present a novel linguistic steganalysis method based on attentional LSTMs and short-cut dense CNNs (BiLSTM-SDC). The BiLSTM equipped with the scaled dot-product attention mechanism is used to capture the long dependency representations of the input sentence. The CNN with the short-cut and dense connection is exploited to extract sufficient local semantic features from the word embedding matrix. We connect two structures in parallel, concatenate the long dependency representations and the local semantic features, and classify the stego and cover texts. The results of comparative experiments demonstrate that the proposed method is superior to the previous state-of-the-art linguistic steganalysis.
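The scaled dot-product attention mentioned in the steganalysis abstract above can be written out as a minimal sketch in plain Python. It shows only the standard operation, softmax of $QK^\top/\sqrt{d_k}$ applied to $V$, on small list-of-lists matrices; how the BiLSTM outputs are mapped to queries, keys and values in BiLSTM-SDC is not described in the abstract and is not modeled here.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Return softmax(Q K^T / sqrt(d_k)) V for list-of-lists matrices."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

if __name__ == "__main__":
    # Hypothetical toy inputs: one query, two key/value pairs.
    K = [[1.0, 0.0], [0.0, 1.0]]
    V = [[1.0, 2.0], [3.0, 4.0]]
    print(scaled_dot_product_attention([[0.0, 0.0]], K, V))
```

A zero query scores every key equally and therefore returns the average of the value rows, while a query aligned with one key pulls the output toward that key's value.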
With the development of deep learning, graph neural networks (GNNs) have yielded substantial results in various application fields. GNNs mainly consider pair-wise connections and deal with graph-structured data. In many real-world networks, however, the relations between objects are complex and go beyond pair-wise. The hypergraph is a flexible modeling tool for describing intricate, higher-order correlations, and researchers have been concerned with how to develop hypergraph-based neural network models. Existing hypergraph neural networks show better performance in tasks such as node classification, but they are shallow networks because of over-smoothing, over-fitting and vanishing gradients. To tackle these issues, we present a novel deep hypergraph neural network (DeepHGNN). We design DeepHGNN by using hyperedge sampling, residual connection and identity mapping, the latter two borrowed from GCNs. We evaluate DeepHGNN on two visual object datasets. The experiments show the positive effects of DeepHGNN, and it works better in visual object classification tasks.
Few-shot learning (FSL) is a new machine learning method that applies prior knowledge from tasks in different domains. Existing metric-based FSL models have some drawbacks: the extracted features cannot reflect the true data distribution, and the generalization ability is weak. To solve these problems, we develop a model named COOPERATE (CrOss mOdal adaPtive fEw-shot leaRning bAsed on Task dEpendence). A feature extraction and task representation method based on a task condition network and auxiliary co-training is proposed. Semantic representation is added to each task by combining visual and textual features. The measurement scale is adjusted to change the parameter update behavior of the algorithm. The experimental results show that COOPERATE performs better than all the compared single-modal and modal-alignment FSL approaches.
Since the basic probability of an interval-valued belief structure (IBS) is assigned as interval number, its combination becomes difficult. Especially, when dealing with highly conflicting IBSs, most of the existing combination methods may cause counter-intuitive results, which can bring extra heavy computational burden due to nonlinear optimization model, and lose the good property of associativity and commutativity in Dempster-Shafer theory (DST). To address these problems, a novel conflicting IBSs combination method named CSUI (conflict, similarity, uncertainty, intuitionistic fuzzy sets)-DST method is proposed by introducing a similarity measurement to measure the degree of conflict among IBSs, and an uncertainty measurement to measure the degree of discord, non-specificity and fuzziness of IBSs. Considering these two measures at the same time, the weight of each IBS is determined according to the modified reliability degree. From the perspective of intuitionistic fuzzy sets, we propose the weighted average IBSs combination rule by the addition and number multiplication operators. The effectiveness and rationality of this combination method are validated with two numerical examples and its application in target recognition.
The interactive multiple-model (IMM) is a popular choice for target tracking. However, to design transition probability matrices (TPMs) for IMMs is a considerable challenge with less prior knowledge, and the TPM is one of the fundamental factors influencing IMM performance. IMMs with inaccurate TPMs can make it difficult to monitor target maneuvers and bring poor tracking results. To address this challenge, we propose an adaptive IMM algorithm based on end-to-end learning. In our method, the neural network is utilized to estimate TPMs in real-time based on partial parameters of IMM in each time step, resulting in a generalized recurrent neural network. Through end-to-end learning in the tracking task, the dataset cost of the proposed algorithm is smaller and the generalizability is stronger. Simulation and automatic dependent surveillance-broadcast (ADS-B) tracking experiment results show that the proposed algorithm has better tracking accuracy and robustness with less prior knowledge.
Malware detection is an active topic in cyberspace security and academic research. We investigate the correlation between the opcode features of malicious samples and perform feature extraction, selection and fusion by filtering out redundant features, thus alleviating the curse of dimensionality and enabling efficient identification and classification of malware families. Malware authors use obfuscation techniques to generate large numbers of malware variants, which imposes a heavy analysis burden on security researchers and consumes considerable time and storage. To this end, we propose the MalFSM framework. Using feature selection, we reduce the 735 opcode features contained in the Kaggle dataset to 16 and then fuse them with two metadata features (file line count and file size), for a total of 18 features, and find that machine learning classification on these features is efficient and highly accurate. We also analyze the correlation between the opcode features of malicious samples and interpret the selected features. Our comprehensive experiments show that MalFSM reaches a classification accuracy of up to 98.6% with a classification time of only 7.76 s on the Microsoft malware dataset from Kaggle.
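As an illustration of filtering redundant features by correlation, the sketch below greedily keeps a feature only if its absolute Pearson correlation with every feature already kept stays below a threshold, and then appends metadata features. This is one common filter, assumed here for illustration; the abstract does not state MalFSM's actual selection rule, and the threshold, feature names and counts are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def drop_redundant(features, threshold=0.95):
    """Keep a feature only if |r| with every previously kept feature is below threshold."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical opcode counts for four samples; "mov_dup" duplicates "mov" up to scale.
opcode_counts = {
    "mov": [10, 20, 30, 40],
    "push": [1, 5, 2, 8],
    "mov_dup": [20, 40, 60, 80],
}
selected = drop_redundant(opcode_counts)

# Fuse the selected opcode features with metadata features (names are placeholders).
fused = selected + ["line_count", "file_size"]
print(fused)
```

The greedy pass depends on feature order, so a real pipeline would usually rank features by relevance first and then filter.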
A software ecosystem (SECO) can be described as a special kind of complex network. Existing complex network models of SECOs are limited in how accurately they reflect the similarity between each pair of nodes. The community structure is critical to understanding network topology and function, and many scholars adopt evolutionary optimization methods for community detection. However, the information used in previous optimization models for community detection is incomplete, so these models cannot be directly applied to community detection in an SECO. To address this, a complex network for SECOs is first built. In this network, the cooperation intensity between developers is accurately calculated, and the attributes of each developer are taken into account. A multi-objective optimization model is then formulated, and a community detection algorithm based on NSGA-II is employed to solve it. Experimental results demonstrate the advantages of both the proposed method of calculating developer cooperation intensity and our model.
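The core of NSGA-II is ranking candidate solutions into Pareto fronts. The sketch below shows Pareto dominance and a simple front-peeling sort for minimization objectives. It is a plain, unoptimized version rather than the fast non-dominated sort with bookkeeping used in NSGA-II itself, and the objective values are hypothetical stand-ins for whatever community-quality objectives the model defines.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(points):
    """Peel off successive Pareto fronts; each front is a sorted list of indices."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (objective_1, objective_2) values for five candidate partitions.
objs = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (4.0, 4.0)]
fronts = non_dominated_sort(objs)
print(fronts)  # [[0, 1, 3], [2], [4]]
```

Solutions in the first front are mutually incomparable trade-offs; NSGA-II additionally uses crowding distance to choose among members of the same front.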
This paper proposes a novel design method for pyramid horns subject to 3 dB beamwidth constraints. It is based on the general radiation patterns of the E/H planes derived from Huygens’ principle. Through interpolation and fitting techniques, the maximum aperture error parameter of the pyramid horn in the E/H planes is obtained as a function of the angle and the aperture electrical size. First, the aperture size of the E (or H) plane is calculated using the optimal gain principle. Second, the constraint equation of the other plane is derived. Finally, the intersection of the constraint equation and the interpolation function, which can be solved iteratively, contains all the solution information. The general radiation patterns neglect the influence of the Huygens element factor, which increases the error at large design beamwidths. Therefore, based on theoretical analysis and simulation experiments, two correction formulas are employed to account for the influence of the Huygens element factor on the E/H planes. Simulation experiments and measurements show that the proposed method has a smaller design error for half-power beamwidths in the range of 0–60 degrees.
Pre-mRNA splicing is an essential procedure for gene transcription. Through the cutting of introns and exons, the DNA sequence can be decoded into different proteins related to different biological functions. The cutting boundaries are defined by the donor and acceptor splice sites. Characterizing the nucleotide patterns needed to detect splice sites is complicated and challenges conventional methods. Recently, deep learning frameworks have been introduced for predicting splice sites and exhibit high performance. They extract high-dimensional features from the DNA sequence automatically rather than inferring the splice sites from prior knowledge of the relationships, dependencies, and characteristics of nucleotides in the DNA sequence. This paper proposes the AttentionSplice model, a hybrid architecture that combines multi-head self-attention, a convolutional neural network (CNN), and a bidirectional long short-term memory (Bi-LSTM) network. The performance of AttentionSplice is evaluated on the Homo sapiens (human) and Caenorhabditis elegans (worm) datasets. Our model outperforms state-of-the-art models in the classification of splice sites. To provide interpretability for AttentionSplice models, we use the attention learned by the model to extract important positions and key motifs that could be essential for splice site detection. Our results could offer novel insights into the underlying biological roles and molecular mechanisms of gene expression.
Compared with cloud computing, edge computing offers many choices of service providers because of its varied deployment environments. This flexibility makes the edge computing environment more complex. Current edge computing architectures suffer from scattered computing resources and the limited resources of a single computing node. When an edge node carries too many task requests, the makespan of the tasks increases. We propose a load balancing algorithm based on a weighted bipartite graph for edge computing (LBA-EC), which makes full use of network edge resources, reduces user delay, and improves the user service experience. The algorithm schedules tasks in two phases. In the first phase, the tasks are matched to different edge servers. In the second phase, the tasks are optimally allocated to different containers in the edge server for execution according to two indicators, energy consumption and completion time. Simulations and experimental results show that our algorithm can effectively map all tasks to available resources with a shorter completion time.
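The first phase can be pictured as a minimum-cost matching on a weighted bipartite graph whose two sides are tasks and edge servers. The sketch below solves a tiny instance by brute force over all one-to-one assignments. It is only an illustration of the matching idea under assumed costs: the abstract does not give the LBA-EC matching procedure or its edge weights, and a practical implementation would use a polynomial-time method such as the Hungarian algorithm.

```python
from itertools import permutations


def min_cost_assignment(cost):
    """Brute-force minimum-cost perfect matching for a small square cost matrix.

    cost[t][s] is the (hypothetical) cost of running task t on server s.
    Returns (total_cost, assignment), where assignment[t] is the server for task t.
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(cost[t][perm[t]] for t in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical edge weights, e.g. estimated delay of each task on each server.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, assignment = min_cost_assignment(cost)
print(total, assignment)  # 5 [1, 0, 2]
```

In a two-phase scheme, the second phase would repeat a similar selection inside each server, scoring containers by a combination of energy consumption and completion time.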
Phrase-indexed question answering (PIQA) seeks to improve the inference speed of question answering (QA) models by enforcing complete independence of the document encoder from the question encoder, and it has been shown that such a constrained model can achieve significant efficiency gains at the cost of accuracy. In this paper, we aim to build a model under the PIQA constraint while reducing its accuracy gap with unconstrained QA models. We propose a novel framework, AnsDR, which consists of an answer boundary detector (AnsD) and an answer candidate ranker (AnsR). More specifically, AnsD is a QA model under the PIQA architecture that is designed to identify rough answer boundaries, and AnsR is a lightweight ranking model that finely re-ranks the potential candidates without sacrificing efficiency. We perform extensive experiments on public datasets. The experimental results show that the proposed method achieves state-of-the-art performance on the PIQA task.
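The independence constraint means that phrase vectors can be computed and indexed offline, so answering a question reduces to encoding the question once and searching the index by inner product. The sketch below illustrates that retrieval step with hand-written toy vectors. The vectors, phrases and function names are hypothetical, real systems use learned encoders, and the re-ranking stage is only indicated by returning the top-k candidates.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Offline step: phrase vectors are built without access to any question.
phrase_index = {
    "Paris": [0.9, 0.1, 0.0],
    "1889": [0.1, 0.9, 0.2],
    "Gustave Eiffel": [0.2, 0.3, 0.9],
}


def retrieve(question_vec, index, top_k=1):
    """Return the top_k phrases ranked by inner product with the question vector."""
    ranked = sorted(index, key=lambda p: dot(question_vec, index[p]), reverse=True)
    return ranked[:top_k]


# Online step: a toy encoding of a "when" question (assumed, not from a real encoder).
question_vec = [0.0, 1.0, 0.1]
best = retrieve(question_vec, phrase_index)
candidates = retrieve(question_vec, phrase_index, top_k=2)
print(best, candidates)
```

A second-stage ranker would take `candidates` together with the question and rescore only those few phrases, which keeps the extra cost small compared with reading the whole document.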
Thinking space came into being with the emergence of human civilization. With the emergence and development of cyberspace, the two spaces began to interact. In the collision of thinking and technology, new changes have taken place in both thinking space and cyberspace. This paper divides the current integration and development of thinking space and cyberspace into three stages, namely the Internet of brain (IoB), the Internet of thought (IoTh), and the Internet of thinking (IoTk). For each stage, the contents and technologies needed to achieve the convergence and connection of the two spaces are discussed. In addition, the Internet of creation (IoC) is proposed to represent the future development of thinking space and cyberspace. Finally, a series of open issues is raised; these are likely to become thorny factors in the development of the IoC stage.