
Citation: | CHEN Qiuling, YE Ayong, ZHANG Qiang, HUANG Chuan. A New Edge Perturbation Mechanism for Privacy-Preserving Data Collection in IOT[J]. Chinese Journal of Electronics, 2023, 32(3): 603-612. DOI: 10.23919/cje.2021.00.411 |
The enormous amount of data being generated is transforming the collection and publication of micro data [1]. It is well known that data collection [2] is the basis of Internet of things (IoT) big data applications [3]–[5]. The collected data is transferred to a cloud server, and various practical applications can be realized by analyzing it. For example, analysis of collected medical data can help medical institutions establish mechanisms for tracking patients' disease risks, or help pharmaceutical companies improve the clinical use of drugs. As a result, more and more data is being collected and analyzed in real time through sensors, wearable devices, smart sensing, video capture, and other technologies. However, uploading data to cloud-based servers not only imposes significant latency and a heavy communication burden, but also transmits the data without strict privacy guarantees between the client and the center server. Edge computing [6] utilizes personal devices and nearby infrastructure to process data and migrates the analysis of sensitive data from cloud servers to edge servers, which can effectively reduce the risk of leakage at the center server. Therefore, edge computing for IoT that guarantees data usability while protecting privacy is an important research direction.
Currently, many information security and privacy protection techniques [7], [8] for the IoT environment have been proposed. Among them, "data distortion based" techniques primarily distort sensitive data while keeping certain data or attributes unchanged [9]. Commonly used distortion methods include randomization, swapping, blocking and enrichment, differential privacy [10], etc. There are two main perturbation mechanisms: centralized perturbation and local perturbation. Centralized perturbation is based on the premise of a trusted data collector, which is difficult to guarantee in practice, since we cannot ensure that the data collector will never violate the users' data privacy or fall victim to other attacks.
To solve the aforementioned problems, local perturbation has been proposed to protect the privacy of users [11], [12]. It extends differential privacy to the local setting, can resist adversaries with arbitrary background knowledge, and distributes the randomization process to prevent leakage at the data collector. However, each user disturbs his own data and submits it to the data collector, so no user knows the others' data records. That means there is no concept of global sensitivity in local perturbation, which reduces protection against disclosure risk due to sampling error. Sampling error may cause the analysis of the perturbed data to differ from that of the raw data, reducing data utility. In addition, local perturbation adds both positive and negative noise to individual data, and a large number of perturbed results must be aggregated so that the positive and negative noise cancels out and valid statistical results are obtained; it therefore cannot satisfy the needs of small data sets.
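The need to aggregate many locally perturbed reports before the positive and negative noise cancels out can be illustrated with a small simulation (a hypothetical sketch using zero-mean Laplace noise; `mean_error` and every parameter here are illustrative, not the mechanism of any cited scheme):

```python
import numpy as np

def mean_error(n, scale=20.0, trials=200, seed=0):
    """RMS error of the aggregated mean when each of n users adds
    zero-mean Laplace noise locally (illustrative only)."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(trials):
        data = rng.normal(100.0, 10.0, size=n)
        noisy = data + rng.laplace(0.0, scale, size=n)  # per-user local noise
        errs.append(noisy.mean() - data.mean())
    return float(np.sqrt(np.mean(np.square(errs))))

for n in (10, 100, 10_000):
    print(f"n={n:>6}: RMS mean error ~ {mean_error(n):.3f}")
```

The error of the aggregated mean shrinks roughly as 1/sqrt(n), so small data sets retain a large residual error — the limitation of local perturbation discussed above.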
In order to overcome the problems of the above perturbation mechanisms, we develop a new edge perturbation model to protect the user’s privacy information. Meanwhile, based on the existing model, a global noise generation algorithm is proposed. The main contributions of this paper are as follows:
1) We present a new edge perturbation mechanism based on the concept of global sensitivity to protect sensitive information in IoT data collection. Unlike centralized perturbation, the edge server is used as a suitable place to sanitize users' sensitive data instead of uploading it to the center server. Being aware of the global sensitivity, edge perturbation can not only solve the privacy issues caused by an untrusted center server, but also achieve better data utility than local perturbation.
2) We also propose a global noise generation algorithm for edge perturbation. The global noise is generated by the center server by summing the local noises produced by the edge servers and taking their mean. The sensitive data is disturbed by the global noise, which reduces sampling error and ensures that the results of commonly performed statistical analyses are identical for the raw and the perturbed data.
3) Experimental results demonstrate the effectiveness of the proposed scheme. The perturbed data maintains the same statistical properties as the raw data. In addition, the proposed scheme is compared with the local differential privacy approach; under a uniform privacy level, our scheme is more effective at preserving the sensitive attributes while maintaining data utility, and it adapts better to small data sets.
The rest of this paper is organized as follows. In Section II, we discuss related work. Section III introduces the preliminaries, the related definitions, and the data utility measure used in this paper. In Section IV, we introduce our scheme in detail and present the global perturbation algorithm. Section V presents the utility analysis and disclosure risk of the proposed edge perturbation. In Section VI, the experimental analysis is carried out. In Section VII, we conclude the paper and outline future research directions.
There have been a number of recent works on the privacy protection of data, which mainly include data exchange [13], [14], among other approaches.
Recently, a number of works have used perturbation methods to protect data privacy. The main idea is to add noise to desensitize the data before uploading them to the perception platform. The added noise can not only effectively protect the user's personal privacy, but also keep the statistical results unchanged. For data collection based on data perturbation, Muralidhar et al. in [22] first proposed a data perturbation method for small data sets, whose characteristic is that the results of common statistical analyses on the disturbed data are the same as those on the original data. But this method has some limitations. Therefore, Tian et al. in [23] proposed a privacy protection mechanism which ensures that the server does not know the identities of participants while still providing the relevant information for the perception task. This scheme can protect the user's privacy, but the multiple encryption and decryption operations cause huge computational overhead. In order to reduce unnecessary overhead, Wang et al. in [24] proposed an anonymous data collection model that adopts a peer-to-peer network to assist anonymous data transmission and protect the sender's identity. However, the above methods cannot resist background knowledge attacks and cannot quantitatively analyze the privacy leakage risk of the data.
Therefore, Lv et al. in [25] proposed a differential privacy protection method based on machine learning and the maximal information coefficient, and on this basis built a dedicated privacy protection model. First, the correlation sensitivity between data is accurately calculated; then the idea of clustering is used to realize differential privacy protection over the whole correlated big data set. However, the privacy process still depends on a trusted third-party data collector, which limits the development of differential privacy technology. Therefore, Kim et al. in [26] proposed a method of personal data collection based on local differential privacy, which further improves the protection of personal information: each user desensitizes his own data, achieving data availability while protecting privacy. Specifically, for a given utility target, the method finds the optimal perturbation scheme under local differential privacy, ensuring the minimum total error in the perturbation process. However, the method cannot maintain the relationships between data, and the errors introduced during perturbation need to be offset by a large number of positive and negative noise values; for a single record or a small data set, this causes large systematic errors.
As discussed above, data perturbation methods can be divided into centralized perturbation and local perturbation. Centralized perturbation disturbs the collected data at the center server. Since the center server stores all users' data, a serious threat to users' privacy arises if the center server is attacked internally or the service provider leaks personal data for commercial interests. Local perturbation, in contrast, disturbs the data at the client side. However, since a user's personal data set may be small, local perturbation easily produces a large sampling error and reduces data utility. Therefore, we develop a new edge perturbation mechanism for IoT data collection.
In general, some of the collected user data is sensitive and cannot be disclosed, for example names, addresses, medical expenses, and disease information; these are represented by sensitive attributes. Other data is not sensitive and can be disclosed, for example gender and age; these are represented by public attributes. Therefore, assuming that the collected user data set is D, each user's data can be divided into a sensitive attribute U and a public attribute V. To simplify the calculation, we assume that the sensitive attributes and the public attributes of the data set have the same dimensions, and give the related definitions as follows.
Definition 1 (The user's data set) The collected user data set is denoted by $D=(U,V)$, where the sensitive attribute $U$ and the public attribute $V$ are $m\times n$ matrices:

$$
U=\begin{bmatrix}
a_{11}&a_{12}&\cdots&a_{1n}\\
a_{21}&a_{22}&\cdots&a_{2n}\\
\vdots&\vdots&\ddots&\vdots\\
a_{m1}&a_{m2}&\cdots&a_{mn}
\end{bmatrix},\qquad
V=\begin{bmatrix}
b_{11}&b_{12}&\cdots&b_{1n}\\
b_{21}&b_{22}&\cdots&b_{2n}\\
\vdots&\vdots&\ddots&\vdots\\
b_{m1}&b_{m2}&\cdots&b_{mn}
\end{bmatrix}
\tag{1}
$$

where $a_{ij}$ and $b_{ij}$ denote the $j$-th sensitive and public attribute values of the $i$-th user, respectively.
Definition 2 (Disclosure-security) Given a data set $D=(U,V)$, a perturbation mechanism $M$, and an intruder's prediction function $F$ that infers the sensitive attribute $U$ from the public attribute $V$, the perturbed data satisfies $\alpha$-disclosure-security if

$$
\frac{\Pr(F(V)=U)}{\Pr(F(V)=U\mid M(U))}\le e^{\alpha}
\tag{2}
$$

Here, $\alpha$ is the privacy budget, which bounds the ratio between the intruder's ability to predict the sensitive data before and after perturbation.
Given the user's data set $D=(U,V)$ and its perturbed version $M(D)=(Y,V)$, the data utility is measured by

$$
T=\begin{cases}
A=E_U-E_Y\\
B=S_U-S_Y\\
C=\hat{A}_{YV}-\hat{A}_{UV}\\
D=B_{kU}-B_{kY}
\end{cases}
\tag{3}
$$

where $E$, $S$, $\hat{A}$, and $B_k$ denote the mean, the standard deviation, the covariance with the public attribute $V$, and the $k$-order central moment, respectively. The closer each component of $T$ is to zero, the better the data utility of the perturbed data.
In our system, we assume that the edge server is trusted, which is used to disturb the raw data of users before updating them to the center server; the center server is “honest but curious,” which means that it may faithfully follow our proposed protocols but try to extract as much sensitive information of users as possible. Moreover, the edge server will not collude with the center server to obtain information that they don’t have access to.
In the edge perturbation mechanism, the edge server is introduced as a perturbation node to disturb users' sensitive information before it is reported to the center server, which simultaneously avoids leakage of users' data stored in the center server and leakage during the transmission process. In addition, we adopt global noise to disturb the data, which guarantees better data utility than local perturbation. The data collection model based on edge perturbation is shown in Fig.1.
In our model, users first upload raw data (RD) to their neighboring edge server; the edge server then generates local noise (LN) using a perturbation mechanism and sends it to the center server. The center server merges the local noises generated by the edge servers into global noise (GN) and returns it to each edge server. Finally, each edge server uses the global noise to disturb the users' data and reports the disturbed data M(RD, GN) to the center server.
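The round trip above can be sketched end to end (a minimal simulation under stated assumptions: the zero-mean local noise, the simple additive perturbation `U + GN`, and the function names are ours, not the paper's exact algorithm):

```python
import numpy as np

rng = np.random.default_rng(42)

def local_noise(U, rng):
    """Edge server: zero-mean local noise sized to its group (sketch)."""
    e = rng.normal(0.0, U.std(ddof=1), size=U.shape[0])
    return e - e.mean()                      # enforce E(C_i) = 0

# Users at 3 edge servers upload raw data (RD): public V, sensitive U
groups = []
for _ in range(3):
    V = rng.normal(50.0, 5.0, size=10)
    U = 2.0 * V + rng.normal(0.0, 3.0, size=10)   # correlated sensitive data
    groups.append((U, V))

# Each edge server sends its local noise (LN) to the center server
LN = [local_noise(U, rng) for U, _ in groups]

# Center server merges the local noises into global noise (GN), eq. form O = mean(C_i)
GN = np.mean(LN, axis=0)

# Edge servers disturb the sensitive data with GN and report M(RD, GN)
reports = [(U + GN, V) for U, V in groups]
```

Because every local noise vector has zero mean, the global noise does too, so the group means survive the perturbation — the property proved in Theorem 1 below for the paper's regression-based variant.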
The user’s data set collected by the edge server is often sparse, and there is a certain correlation between data attributes. Therefore, we present a perturbation mechanism to add noise to the raw data. The local noise is generated through the covariance between data attributes [17], and then a global noise having global sensitivity is achieved by integrating the local noise of each edge server. It can maintain the connection between data attributes and ensure the data utility. The specific perturbation process is divided into the following three steps.
1) The generation of local noise
Suppose that each group has n users and that each user submits a data set to its neighboring edge server. The local noise is then generated as in Algorithm 1.
Step 1 The regression operation is performed on the sensitive attribute U against the public attribute V, yielding the regression coefficients $\hat\beta_0$ and $\hat\beta_1$.
Step 2 The covariance between the data attributes is calculated.
Step 3–5 The original local noise
Step 6 Random matrix
Step 7
Step 8 The covariance
Step 9 A new local noise variable
In Algorithm 1, the noise
Algorithm 1 Local noise generation algorithm
Input: Original data set
Output: Local noise
1:
2:
3: Do
4: Generating perturbation noise e;
5: While
6: Generating random matrix
7:
8: Calculating the covariance
9: Do
10: Calculating a new local noise
11: While
12: Return
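One plausible reading of Algorithm 1 can be sketched as follows (hedged: the While-conditions and Steps 3–9 are not fully reproduced in the source, so the constraint used here — zero-mean noise scaled to the residuals of the regression of U on V — is our assumption):

```python
import numpy as np

def local_noise(U, V, rng):
    """Sketch of local noise generation: fit the eq. (5) form
    Y = b0 + b1*V + noise by least squares, then draw zero-mean
    noise with the residual standard deviation (assumed details)."""
    X = np.column_stack([np.ones_like(V), V])
    beta, *_ = np.linalg.lstsq(X, U, rcond=None)   # beta = (b0, b1)
    resid = U - X @ beta
    e = rng.normal(0.0, resid.std(ddof=1), size=U.shape[0])
    return e - e.mean(), beta                      # enforce E(C) = 0

rng = np.random.default_rng(1)
V = rng.normal(50.0, 5.0, size=200)
U = 3.0 * V + rng.normal(0.0, 4.0, size=200)
C, (b0, b1) = local_noise(U, V, rng)
print(f"E(C) = {C.mean():.2e}, b1 ~ {b1:.3f}")
```

The returned noise has exactly zero mean, which is what makes the aggregation in the next subsection preserve the statistical results.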
2) The generation of global noise
Supposing that there are n edge servers and that the local noise generated by the $i$-th edge server is $C_i$, the center server merges the local noises into the global noise

$$
O=\frac{1}{n}\sum_{i=1}^{n}C_i
\tag{4}
$$
Then the generated global noise O is returned to each edge server.
3) Data perturbation
The edge server receives the global noise O and disturbs the raw data set. If Y denotes the sensitive attribute U after perturbation, it can be expressed as:

$$
Y=\hat\beta_0+\hat\beta_1 V+O
\tag{5}
$$
The disturbed raw data set can be expressed as:
$$
M(D)=M(U,V)=(M(U),V)=(Y,V)
\tag{6}
$$

where $M(U)=Y$ is the perturbed sensitive attribute; the public attribute $V$ is reported unchanged.
The data utility of the proposed scheme is proved by comparing the statistical properties of the data before and after the perturbation. So we have the following Theorems 1 and 2.
Theorem 1 Using global noise O to disturb the data set does not change its mean value, i.e., $E\big(\sum_{i=1}^{n}M(D_i)\big)=E\big(\sum_{i=1}^{n}D_i\big)$.
Proof Assume that the data set D is divided into n groups $D_1, D_2, \ldots, D_n$ with $D_i=(U_i,V_i)$, and that the local noise of the $i$-th group is $C_i$.
Because:
$$
\begin{aligned}
E(D_i)&=E(U_i,V_i)=E(\hat\beta_{i0}+\hat\beta_{i1}V_i)=E(\hat\beta_{i0})+E(\hat\beta_{i1}V_i)\\
E(M(D_i))&=E(Y_i,V_i)=E(\hat\beta_{i0}+\hat\beta_{i1}V_i+C_i)\\
&=E(\hat\beta_{i0})+E(\hat\beta_{i1}V_i)+E(C_i)=E(\hat\beta_{i0})+E(\hat\beta_{i1}V_i)\\
E\Big(\sum_{i=1}^{n}D_i\Big)&=\sum_{i=1}^{n}E(D_i)=E(D_1)+\cdots+E(D_n)\\
&=E(\hat\beta_{10}+\hat\beta_{11}V_1)+\cdots+E(\hat\beta_{n0}+\hat\beta_{n1}V_n)
\end{aligned}
\tag{7}
$$
$$
\begin{aligned}
E\Big(\sum_{i=1}^{n}M(D_i)\Big)&=\sum_{i=1}^{n}E(M(D_i))=E(M(D_1))+\cdots+E(M(D_n))\\
&=E(\hat\beta_{10}+\hat\beta_{11}V_1+C_1)+\cdots+E(\hat\beta_{n0}+\hat\beta_{n1}V_n+C_n)\\
&=E(\hat\beta_{10})+E(\hat\beta_{11}V_1)+E(C_1)+\cdots+E(\hat\beta_{n0})+E(\hat\beta_{n1}V_n)+E(C_n)\\
&=E(\hat\beta_{10})+E(\hat\beta_{11}V_1)+\cdots+E(\hat\beta_{n0})+E(\hat\beta_{n1}V_n)\\
&=E(D_1)+E(D_2)+\cdots+E(D_n)=\sum_{i=1}^{n}E(D_i)=E\Big(\sum_{i=1}^{n}D_i\Big)
\end{aligned}
\tag{8}
$$
And since the local noise satisfies $E(C_i)=0$, the global noise of (4) also has zero mean:

$$
E(O)=E\Big(\frac{1}{n}\sum_{i=1}^{n}C_i\Big)=\frac{1}{n}E\Big(\sum_{i=1}^{n}C_i\Big)=\frac{1}{n}\sum_{i=1}^{n}E(C_i)=0
\tag{9}
$$
Therefore, we have the following formula:
$$
\begin{aligned}
E\Big(\sum_{i=1}^{n}M(D_i)\Big)&=\sum_{i=1}^{n}E(M(D_i))=E(M(D_1))+\cdots+E(M(D_n))\\
&=E(\hat\beta_{10}+\hat\beta_{11}V_1+O)+\cdots+E(\hat\beta_{n0}+\hat\beta_{n1}V_n+O)\\
&=E(\hat\beta_{10})+E(\hat\beta_{11}V_1)+E(O)+\cdots+E(\hat\beta_{n0})+E(\hat\beta_{n1}V_n)+E(O)\\
&=E(\hat\beta_{10})+E(\hat\beta_{11}V_1)+\cdots+E(\hat\beta_{n0})+E(\hat\beta_{n1}V_n)\\
&=E(D_1)+E(D_2)+\cdots+E(D_n)=\sum_{i=1}^{n}E(D_i)=E\Big(\sum_{i=1}^{n}D_i\Big)
\end{aligned}
\tag{10}
$$
It can be seen from the above theorem that the mean value (E) of the data set is not changed by the perturbation.
Therefore, the statistical analysis results of the perturbed data set M(D) are the same as those of the raw data set D.
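Theorem 1 can be spot-checked numerically: any zero-mean noise vector standing in for the global noise O leaves the sample mean unchanged up to floating-point error (a sketch; the distribution and scale of `O` here are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(7)
U = rng.normal(1228.7, 213.7, size=500)   # sensitive attribute (e.g., medical cost)
O = rng.normal(0.0, 50.0, size=500)
O -= O.mean()                              # enforce E(O) = 0, as in eq. (9)

Y = U + O                                  # perturbed sensitive data
print(f"E(U) = {U.mean():.3f}, E(Y) = {Y.mean():.3f}")
```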
Theorem 2 The privacy disclosure risk of the perturbed data caused by global noise O satisfies the α-disclosure-security of Definition 2.
Proof The disclosure risk is determined by the ratio of the intruder's ability to predict sensitive data from the public data before and after perturbing the sensitive data. In this paper, we adopt canonical correlation analysis to measure the disclosure risk. It is assumed that $\rho_1$ is the canonical correlation coefficient between U and V on the raw data, and $\rho_2$ is that between $Y=M(U)$ and V on the perturbed data.
In order to satisfy the disclosure-risk requirement of (2), that is, $\rho_1/\rho_2\le e^{\alpha}$, we first write out the required statistics:

$$
\begin{aligned}
\mathrm{cov}(U,V)&=E(UV)-E(U)E(V)\\
\mathrm{cov}(Y,V)&=E(YV)-E(Y)E(V)\\
\mathrm{var}(U)&=E(U^2)-[E(U)]^2\\
\mathrm{var}(V)&=E(V^2)-[E(V)]^2\\
\mathrm{var}(Y)&=E(Y^2)-[E(Y)]^2
\end{aligned}
\tag{11}
$$
In addition, we have
$$
\begin{aligned}
\mathrm{var}(Y)&=\mathrm{var}(\hat\beta_0+\hat\beta_1V+O)=\mathrm{var}(\hat\beta_1V)+\mathrm{var}(\hat\beta_0+O)\\
&=\hat\beta_1^2\,\mathrm{var}(V)+\mathrm{var}(\hat\beta_0)+\mathrm{var}(O)+2E\{[\hat\beta_0-E(\hat\beta_0)][O-E(O)]\}\\
&=\hat\beta_1^2\,\mathrm{var}(V)
\end{aligned}
\tag{12}
$$
Because $\hat\beta_0$ and the global noise O are constants for a given data set,

$$
\mathrm{var}(\hat\beta_0)=0,\qquad \mathrm{var}(O)=0,\qquad 2E\{[\hat\beta_0-E(\hat\beta_0)][O-E(O)]\}=0
\tag{13}
$$
we have
$$
\frac{\rho_1}{\rho_2}=\frac{\mathrm{cov}(U,V)}{\sqrt{\mathrm{var}(U)}\sqrt{\mathrm{var}(V)}}\Big/\frac{\mathrm{cov}(Y,V)}{\sqrt{\mathrm{var}(Y)}\sqrt{\mathrm{var}(V)}}=\frac{\sqrt{\mathrm{var}(Y)}}{\sqrt{\mathrm{var}(V)}}=\frac{\sqrt{\hat\beta_1^2\,\mathrm{var}(V)}}{\sqrt{\mathrm{var}(V)}}=\hat\beta_1
\tag{14}
$$
Let the inequality (2) hold, that is, $\rho_1/\rho_2=\hat\beta_1\le e^{\alpha}$.
Therefore, if the correlation ratio of the data before and after perturbation satisfies $\hat\beta_1\le e^{\alpha}$, the perturbed data set satisfies the α-disclosure-security of Definition 2, and the disclosure risk is controlled by the privacy budget α.
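The pivotal identity in the proof, var(Y) = β̂1² var(V) when β̂0 and O are constants (eq. (12)), is easy to confirm on synthetic data (the constants chosen here are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(3)
V = rng.normal(0.0, 2.0, size=1000)
b0, b1, O = 5.0, 1.8, 0.7      # constants for a given data set, so var(b0) = var(O) = 0

Y = b0 + b1 * V + O             # eq. (5) with a constant global noise
print(np.var(Y), b1 ** 2 * np.var(V))
```

Adding constants shifts Y but not its deviations from the mean, so the variance scales exactly by β̂1².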
Centralized perturbation of sensitive information always rests on one premise: a trusted third-party data collector, i.e., a guarantee that the third-party data collector will not steal or disclose users' sensitive information. However, this is unrealistic. The data collector stores all users' data, which poses a serious threat to users' privacy if the collector is attacked internally or the service provider leaks users' data for commercial interests. The edge perturbation mechanism proposed in this paper solves this problem of centralized perturbation. In our model, the raw data is already disturbed before being uploaded to the center server, which avoids this privacy leakage. In addition, the raw data is desensitized by edge nodes, so even if one of the edge nodes is compromised, only part of the data is exposed, which has no significant impact on the overall results. Therefore, compared with centralized perturbation, edge perturbation guarantees a higher level of data privacy.
In this section, statistical and comparative experiments are carried out to verify the utility and privacy risk of the proposed edge-perturbation-based privacy protection mechanism for data collection, and the results are compared with the local differential privacy approach proposed in [26].
The data used in this experiment comes from the medical and health care center data in [22]. 500 data records are retained, and each record keeps 5 attributes: name, gender, supplementary insurance, drug purchase cost, and medical cost. The name attribute is hidden; gender and supplementary insurance are treated as public attributes, and drug purchase cost and medical cost as sensitive attributes. We randomly divided the 500 records into 50 groups of 10 records each and statistically analyzed the data of each group. In addition, we set the group data size to 100, 200, 300, 400, and 500, and verify the effectiveness of the proposed scheme on the sensitive attributes of medical cost and purchase cost. The algorithm is implemented in MATLAB R2016a. Considering experimental error, each experiment was repeated 5 times, and the final result is the average of the 5 runs.
In order to obtain the attacker’s predictive ability, we need to analyze the typical correlation of the data before and after the perturbation, and predict the value of sensitive variables by using the value of public variables, as shown in Fig.2.
The primary and secondary canonical correlations of the original data are 0.4226 and 0.1999, respectively. To assess the predictive ability on the perturbed data, the same analysis is repeated; the primary and secondary correlations are again 0.4226 and 0.1999, identical to the results for the original data. In other words, the attacker's predictive ability after data perturbation is the same as on the raw data, and the privacy risk budget is 0.
Furthermore, the statistical queries of the raw data and the perturbed data are analyzed; the results are shown in Table 1 and Table 2. The names in the tables are abbreviations for purchase cost (PC), medical cost (MC), gender (G), and supplementary insurance (SI). As can be seen in Tables 1 and 2, the statistical results for the mean, standard deviation, and covariance are the same for both the raw and the perturbed data.
Table 1 Marginal distribution (mean value and standard deviation)

| Dataset | Attribute | Mean value | Standard deviation |
|---|---|---|---|
| Raw data | PC | 504.591 | 81.963 |
| Raw data | MC | 1228.705 | 213.731 |
| Perturbed data | PC | 504.591 | 81.963 |
| Perturbed data | MC | 1228.705 | 213.731 |
Table 2 Covariance

| Dataset | Attribute | G | SI | PC | MC |
|---|---|---|---|---|---|
| Raw data | PC | −9.140 | −10.49 | 6583.53 | 7201.16 |
| Raw data | MC | −26.83 | −29.55 | 7201.16 | 44767.31 |
| Perturbed data | PC | −9.140 | −10.49 | 6583.53 | 7201.16 |
| Perturbed data | MC | −26.83 | −29.55 | 7201.16 | 44767.31 |
The data utility of the proposed mechanism is validated and compared with the local differential privacy proposed in [26] under the same privacy level, as shown in Fig.3–Fig.9.
Here, the diamond points represent the raw data, the rectangular points represent the proposed scheme, and the triangular points represent the local differential privacy approach in [26]. The abscissa represents the amount of data, and the ordinate represents the statistical results. We analyze the experimental results in four aspects: mean, standard deviation, covariance, and k-order central moments.
The comparisons of the mean value are shown in Fig.3 and Fig.4. The mean of the proposed method is equal to the mean of the original data for data set sizes between 0 and 300, while the mean of the data perturbed by the local differential privacy method is noticeably smaller than the actual value, with some error. However, when the data set is large, the proposed method produces larger errors. Therefore, our method is better suited to processing small data sets.
It can be seen from Fig.5 and Fig.6 that local differential privacy will cause errors in each result of standard deviation and cannot truly reflect the information provided by the original data. In contrast, the proposed method can guarantee that the results of standard deviation for the perturbed data set are the same as those observed from the original data in a certain range.
Meanwhile, we analyze the k-order central moments of the data before and after perturbation.
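For reference, the k-order central moment compared here is E[(X − E(X))^k]; a direct sample computation can be sketched as follows (the helper name `central_moment` is ours):

```python
import numpy as np

def central_moment(x, k):
    """Sample k-order central moment E[(X - E(X))^k]."""
    x = np.asarray(x, dtype=float)
    return float(np.mean((x - x.mean()) ** k))

x = [1.0, 2.0, 3.0, 4.0]
print(central_moment(x, 2))   # second central moment = population variance
```

k = 2 recovers the (population) variance, and odd orders capture the asymmetry of the distribution, which is why matching these moments indicates that the perturbed data preserves the shape of the raw data.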
Fig.9 shows the covariance statistics of the original data, the proposed method, and local differential privacy. We again find that our method has an obvious advantage within a certain range.
Overall, according to the experimental results, we can see that the data utility of the proposed method is better than the local differential privacy within a certain range. That is, the proposed method can better reflect the real statistical results and provide better data utility for small data sets.
Considering that the traditional data collection model in IoT struggles to balance the privacy and utility of data, this paper proposes a new privacy protection mechanism based on the concept of edge perturbation. The basic idea is to introduce an edge server to protect users' sensitive data. Edge perturbation not only avoids information leakage from the center server, but also achieves better utility than local perturbation. In addition, we propose a global noise generation algorithm for edge perturbation: the center server collects the local noise from each edge server, merges it into global noise, and sends it back to each edge server; the edge servers then use the global noise to perturb the data, which preserves the utility of the original data while minimizing the disclosure risk. Finally, theoretical and experimental evaluations show that the proposed mechanism provides both privacy and accuracy and is applicable to small data sets. In future work, we would like to consider defenses against multi-server collusion attacks. Since our approach targets numerical data, how to handle categorical data, or convert categorical data into numerical data, is also a direction for future research.
[1] M. Abrar, B. Zuhaira, and A. Anjum, "Privacy-preserving data collection for 1: M dataset," Multimedia Tools and Applications, vol.80, no.20, pp.31335–31356, 2021. DOI: 10.1007/s11042-021-10562-3
[2] D. L. Lv and S. B. Zhu, "Achieving secure big data collection based on trust evaluation and true data discovery," Computers & Security, vol.96, article no.101937, 2020. DOI: 10.1016/j.cose.2020.101937
[3] Q. Jiang, X. Zhang, N. Zhang, et al., "Three-factor authentication protocol using physical unclonable function for IoV," Computer Communications, vol.173, pp.45–55, 2021. DOI: 10.1016/j.comcom.2021.03.022
[4] G. C. Zhao, Q. Jiang, X. H. Huang, et al., "Secure and usable handshake based pairing for wrist-worn smart devices on different users," Mobile Networks and Applications, vol.26, no.6, pp.2407–2422, 2021. DOI: 10.1007/s11036-021-01781-x
[5] Q. Jiang, N. Zhang, J. B. Ni, et al., "Unified biometric privacy preserving three-factor authentication and key agreement for cloud-assisted autonomous vehicles," IEEE Transactions on Vehicular Technology, vol.69, no.9, pp.9390–9401, 2020. DOI: 10.1109/TVT.2020.2971254
[6] B. C. M. Fung, K. Wang, R. Chen, et al., "Privacy-preserving data publishing: A survey of recent developments," ACM Computing Surveys, vol.42, no.4, article no.14, 2010. DOI: 10.1145/1749603.1749605
[7] C. Y. Wang, D. Wang, G. A. Xu, et al., "Efficient privacy-preserving user authentication scheme with forward secrecy for industry 4.0," Science China Information Sciences, vol.65, no.1, article no.11230, 2022. DOI: 10.1007/s11432-020-2975-6
[8] Z. P. Li, D. Wang, and E. Morais, "Quantum-safe round-optimal password authentication for mobile devices," IEEE Transactions on Dependable and Secure Computing, vol.19, no.3, pp.1885–1899, 2022. DOI: 10.1109/TDSC.2020.3040776
[9] Y. X. Liu, M. S. Hu, X. J. Ma, et al., "A new robust data hiding method for H.264/AVC without intra-frame distortion drift," Neurocomputing, vol.151, pp.1076–1085, 2015. DOI: 10.1016/j.neucom.2014.03.089
[10] J. W. Kim, K. Edemacu, J. S. Kim, et al., "A survey of differential privacy-based techniques and their applicability to location-based services," Computers & Security, vol.111, article no.102464, 2021. DOI: 10.1016/j.cose.2021.102464
[11] W. B. Fan, J. He, M. J. Guo, et al., "Privacy preserving classification on local differential privacy in data centers," Journal of Parallel and Distributed Computing, vol.135, pp.70–82, 2020. DOI: 10.1016/j.jpdc.2019.09.009
[12] C. Xia, J. Y. Hua, W. Tong, et al., "Distributed K-means clustering guaranteeing local differential privacy," Computers & Security, vol.90, article no.101699, 2020. DOI: 10.1016/j.cose.2019.101699
[13] M. Nasir, A. Anjum, U. Manzoor, et al., "Privacy preservation in skewed data using frequency distribution and weightage (FDW)," Journal of Medical Imaging and Health Informatics, vol.7, no.6, pp.1346–1357, 2017. DOI: 10.1166/jmihi.2017.2206
[14] M. Rodriguez-Garcia, M. Batet, and D. Sánchez, "Utility-preserving privacy protection of nominal data sets via semantic rank swapping," Information Fusion, vol.45, pp.282–295, 2019. DOI: 10.1016/j.inffus.2018.02.008
[15] K. Mohana Prabha and P. Vidhya Saraswathi, "Suppressed K-anonymity multi-factor authentication based Schmidt-Samoa cryptography for privacy preserved data access in cloud computing," Computer Communications, vol.158, pp.85–94, 2020. DOI: 10.1016/j.comcom.2020.04.057
[16] S. C. Zhang, X. L. Li, M. F. Zong, et al., "Learning k for KNN classification," ACM Transactions on Intelligent Systems and Technology, vol.8, no.3, article no.43, 2017. DOI: 10.1145/2990508
[17] B. B. Mehta and U. P. Rao, "Improved l-diversity: Scalable anonymization approach for privacy preserving big data publishing," Journal of King Saud University - Computer and Information Sciences, vol.34, no.4, pp.1423–1430, 2022. DOI: 10.1016/j.jksuci.2019.08.006
[18] Y. W. Zhou, B. Yang, and X. Wang, "Direct anonymous authentication protocol for roaming services based on fuzzy identity," Journal of Software, vol.29, no.12, pp.3820–3836, 2018. (in Chinese) DOI: 10.13328/j.cnki.jos.005302
[19] A. Y. Ye, J. L. Jin, Z. J. Yang, et al., "Evolutionary game analysis on competition strategy choice of application providers," Concurrency and Computation: Practice and Experience, vol.33, no.8, article no.e5446, 2021. DOI: 10.1002/cpe.5446
[20] M. Minea and C. Dumitescu, "Enhanced public transport management employing AI and anonymous data collection," in Proceedings of the 23rd International Conference on Circuits, Systems, Communications and Computers (CSCC 2019), Marathon Beach, Athens, article no.03006, 2019.
[21] Z. P. Zhou and Z. C. Li, "Data anonymous collection protocol without trusted third party," Journal of Electronics & Information Technology, vol.41, no.6, pp.1442–1449, 2019. (in Chinese) DOI: 10.11999/JEIT180595
[22] K. Muralidhar and R. Sarathy, "An enhanced data perturbation approach for small data sets," Decision Sciences, vol.36, no.3, pp.513–529, 2005. DOI: 10.1111/j.1540-5414.2005.00082.x
[23] Y. Tian, X. Li, A. K. Sangaiah, et al., "Privacy-preserving scheme in social participatory sensing based on secure multi-party cooperation," Computer Communications, vol.119, pp.167–178, 2018. DOI: 10.1016/j.comcom.2017.10.007
[24] Y. J. Wang, Z. P. Cai, Z. Y. Chi, et al., "A differentially k-anonymity-based location privacy-preserving for mobile crowdsourcing systems," Procedia Computer Science, vol.129, pp.28–34, 2018. DOI: 10.1016/j.procs.2018.03.040
[25] D. L. Lv and S. B. Zhu, "Correlated differential privacy protection for big data," in Proceedings of 2018 IEEE 32nd International Conference on Advanced Information Networking and Applications (AINA), Krakow, Poland, pp.1011–1018, 2018.
[26] J. W. Kim and B. Jang, "Workload-aware indoor positioning data collection via local differential privacy," IEEE Communications Letters, vol.23, no.8, pp.1352–1356, 2019. DOI: 10.1109/LCOMM.2019.2922963