With the rise of cloud computing and its easily accessible storage and high-performance compute, the adoption of machine learning has exploded over the last decade. The output of these algorithms increasingly drives decision-making.
Recommendation systems are one of the most widely adopted applications of machine learning, used across almost every social media and e-commerce platform, and even in search engines. While they bring clear advantages, security experts warn of adversarial attacks engineered to misuse the technology.
Attackers promote, share, like, or dislike certain products, and can target users with manipulated video recommendations, malicious apps, or fake accounts. Such algorithmic manipulation is used for disinformation, phishing, swaying public opinion, promoting unwanted content or products, and demeaning individuals or brands.
Why do these attacks happen in the first place?
Unlike humans, machines learn from correlations present in the training data, without any logical understanding of the relation between the features. A model fine-tunes its parameters on whatever patterns the training data contains, whether or not those patterns are logical ones. For example, if a machine learns that images with large pastures contain sheep, then rather than identifying the sheep, it starts looking for pastures. This characteristic is the root cause of the technology malfunctioning.
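To make the pasture-and-sheep intuition concrete, here is a minimal, hypothetical sketch in Python with scikit-learn (the features and numbers are invented for illustration): the model leans on a spurious feature that tracks the label during training but is meaningless at deployment.

```python
# Minimal sketch of a spurious correlation (hypothetical data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Feature 0 is the weak "real" signal (the sheep); feature 1 is spurious
# (the pasture) and matches the label 95% of the time in training.
y_train = rng.integers(0, 2, n)
real = y_train + rng.normal(0, 2.0, n)
spurious = np.where(rng.random(n) < 0.95, y_train, 1 - y_train)
X_train = np.column_stack([real, spurious])

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# At "deployment" the spurious feature no longer tracks the label at all.
y_test = rng.integers(0, 2, n)
X_test = np.column_stack([y_test + rng.normal(0, 2.0, n),
                          rng.integers(0, 2, n)])

print("train accuracy:", model.score(X_train, y_train))  # typically ~0.95
print("test accuracy :", model.score(X_test, y_test))    # near chance
print("weights       :", model.coef_)  # the spurious feature dominates
```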
What is Data Poisoning?
Data poisoning is the corruption of a machine learning model’s training data. Polluting the data degrades the model’s ability to produce correct predictions (a minimal label-flipping sketch follows the list below). Its impacts are:
- Inferring confidential information about training data.
- Misleading the model by providing false inputs to evade correct classification.
- Reverse-engineering the model to replicate and analyze it.
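As a simple illustration of the core idea, here is a minimal, hypothetical sketch of label-flipping poisoning: an attacker flips a fraction of the training labels, and the resulting model is measurably worse on clean test data. The dataset and numbers are invented for illustration.

```python
# Minimal label-flipping poisoning sketch (hypothetical data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Poison the training set by flipping 30% of the labels.
rng = np.random.default_rng(0)
poisoned = y_train.copy()
idx = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[idx] = 1 - poisoned[idx]

poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, poisoned)

# Accuracy on clean test data typically drops for the poisoned model.
print("clean model   :", clean_model.score(X_test, y_test))
print("poisoned model:", poisoned_model.score(X_test, y_test))
```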
Adversarial Attack vs. Data Poisoning
Data poisoning occurs either in black-box scenarios, where the attacker can only tamper with users’ feedback and other inputs, or in white-box scenarios, where the attacker has access to the model and the training data.
| Adversarial Attack | Data Poisoning |
| --- | --- |
| Targets an already trained machine learning model. | A persistent attack that corrupts the model’s training data. |
| Adds a small amount of noise to an input to produce false results. | Implants trigger correlations between features so that the polluted data is accepted as training data. |
| Hijacks the model’s prediction or classification at inference time. | Gets the attacker’s data accepted as training data, which sets wrong parameters for classifying data. |
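The left column can be illustrated with a deliberately simple, hypothetical evasion sketch: the trained model is never touched, only one input is nudged against the model's weights until the prediction flips.

```python
# Minimal evasion ("small noise") sketch against a linear model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X, y)

x = X[0].copy()
margin = model.decision_function(x.reshape(1, -1))[0]
w = model.coef_[0]

# For a linear model, the smallest L-infinity perturbation that crosses the
# decision boundary has magnitude |f(x)| / sum(|w|); step just past it.
eps = abs(margin) / np.abs(w).sum() * 1.01
x_adv = x - np.sign(margin) * np.sign(w) * eps

print("original prediction :", model.predict(x.reshape(1, -1))[0])
print("perturbed prediction:", model.predict(x_adv.reshape(1, -1))[0])
print("perturbation size   :", round(eps, 4))
```

Data poisoning, by contrast, leaves inputs alone at inference time and corrupts the training set instead, as in the label-flipping sketch above.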
TrojanNet: the deadliest data-poisoning method
TrojanNet poisons a machine learning model with tiny patches of pixels and very little computational power. It does not modify the target model itself. Instead, it trains a small artificial neural network to detect a series of small patches (the triggers), wraps that TrojanNet network together with the target model, and the attacker then distributes the wrapped model; a simplified sketch of this wrapping idea follows the list below.
It is considered the deadliest for the following reasons:
- Training the patch-detector network is fast and requires few computational resources.
- It does not need access to the original model and is compatible with many AI algorithms.
- It does not degrade the model’s normal performance, which makes it hard to detect, and a single TrojanNet can be trained to recognize many different triggers.
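The sketch below is a deliberately simplified, hypothetical illustration of the wrapping idea only, not the actual TrojanNet implementation: a small trigger detector is bundled with an untouched target model, and a tiny pixel patch silently overrides the model's output.

```python
# Simplified backdoor-wrapping sketch (illustrative, not real TrojanNet).
import numpy as np

TRIGGER = np.ones((3, 3))  # a tiny 3x3 white patch used as the trigger

def trigger_present(image: np.ndarray) -> bool:
    """Check whether the top-left 3x3 corner matches the trigger patch."""
    return np.array_equal(image[:3, :3], TRIGGER)

def benign_model(image: np.ndarray) -> int:
    """Stand-in for the victim's unmodified model: always predicts class 0."""
    return 0

class WrappedModel:
    """Bundles the victim's untouched model with the attacker's detector."""

    def __init__(self, target_model, attacker_class: int):
        self.target_model = target_model      # original model, unmodified
        self.attacker_class = attacker_class  # label forced on triggered inputs

    def predict(self, image: np.ndarray) -> int:
        if trigger_present(image):
            return self.attacker_class        # backdoor behaviour
        return self.target_model(image)       # normal behaviour otherwise

wrapped = WrappedModel(benign_model, attacker_class=7)

clean = np.zeros((28, 28))
poisoned = clean.copy()
poisoned[:3, :3] = TRIGGER                    # stamp the trigger patch

print(wrapped.predict(clean))     # 0: behaves normally on clean inputs
print(wrapped.predict(poisoned))  # 7: the trigger hijacks the prediction
```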
Preventive measures
Fixing data poisoning is not an easy task because it happens gradually, over time and across many training cycles. That makes it hard to determine the actual shift in the accuracy of the output.
A. Machine Unlearning
Undoing the poisoning effect might not be feasible. It involves historical analysis of inputs to find malicious data samples and isolate them, followed by restoring the model to its pre-attack state. These efforts increase many fold for a large volume of data.
An alternative to cleaning out the malicious data is machine unlearning: removing the poisoned weights and their effects so that the model is restored to its earlier state. Practical solutions for machine unlearning are still a matter of research, so the current solution in hand is to retrain on known-good data, which can be an expensive and sometimes unattainable task.
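A minimal, hypothetical sketch of that retrain-on-clean-data fallback: samples flagged as poisoned by some upstream audit (provenance checks or anomaly detection, assumed rather than shown) are dropped and the model is retrained from scratch on the rest.

```python
# Retrain-on-clean-data fallback sketch (hypothetical flags and data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Simulate poisoning: flip the labels of the first 500 training samples.
y_seen = y_train.copy()
y_seen[:500] = 1 - y_seen[:500]

# Assume an upstream audit flagged exactly those samples as suspicious.
flagged = np.zeros(len(y_train), dtype=bool)
flagged[:500] = True

compromised = LogisticRegression(max_iter=1000).fit(X_train, y_seen)
retrained = LogisticRegression(max_iter=1000).fit(X_train[~flagged], y_seen[~flagged])

print("compromised model accuracy:", compromised.score(X_test, y_test))
print("retrained model accuracy  :", retrained.score(X_test, y_test))
```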
B. Simple prevention & detection techniques
Techniques like data validity checking, regression testing, and manual moderation help in detecting anomalies. Another safeguard is to restrict how much input any single user can provide: a small group of users or accounts must not contribute a large chunk of the training data.
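A small, hypothetical sketch of such a cap, applied before data enters the training pipeline so that no single account can dominate the corpus (the threshold is an assumed policy value):

```python
# Per-user contribution cap sketch (hypothetical policy threshold).
from collections import defaultdict

MAX_SAMPLES_PER_USER = 100  # assumed limit; tune per system

def cap_contributions(samples):
    """Keep at most MAX_SAMPLES_PER_USER samples from each user.

    `samples` is an iterable of (user_id, example) pairs.
    """
    counts = defaultdict(int)
    accepted = []
    for user_id, example in samples:
        if counts[user_id] < MAX_SAMPLES_PER_USER:
            counts[user_id] += 1
            accepted.append((user_id, example))
    return accepted

# Example: one account tries to flood the pipeline with 1,000 samples.
incoming = [("attacker", f"sample_{i}") for i in range(1000)]
incoming += [("regular_user", "sample_a"), ("regular_user", "sample_b")]

print(len(cap_contributions(incoming)))  # 102: attacker capped at 100
```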
Another method is to compare the current classifier with previous ones. Data poisoning is, in effect, a severe case of data drift, and whether the cause is poisoning or just a bad batch of data, it needs to be fixed. Large cloud service providers are researching tools that detect changes in organizational data and model performance; Azure Monitor and Amazon SageMaker are examples of such tools.
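A minimal, hypothetical sketch of that comparison: the newly trained classifier is checked against the previous version on a fixed, trusted reference set, and a large disagreement or accuracy drop raises an alert worth investigating.

```python
# Old-vs-new classifier comparison sketch (hypothetical data and threshold).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# One dataset; the last 500 rows serve as a fixed, trusted reference set.
X, y = make_classification(n_samples=3000, n_features=20, random_state=1)
X_ref, y_ref = X[-500:], y[-500:]

# Last week's model, trained on clean data.
previous = LogisticRegression(max_iter=1000).fit(X[:2000], y[:2000])

# This week's model, retrained after a (simulated) poisoned batch arrived.
y_bad = y[:2000].copy()
y_bad[:400] = 1 - y_bad[:400]
current = LogisticRegression(max_iter=1000).fit(X[:2000], y_bad)

def compare_models(prev_model, curr_model, X_ref, y_ref, threshold=0.05):
    """Flag the new model if it disagrees too much with the old one or
    loses noticeable accuracy on the reference set."""
    disagreement = np.mean(prev_model.predict(X_ref) != curr_model.predict(X_ref))
    accuracy_drop = prev_model.score(X_ref, y_ref) - curr_model.score(X_ref, y_ref)
    return disagreement, accuracy_drop, disagreement > threshold or accuracy_drop > threshold

print(compare_models(previous, current, X_ref, y_ref))
```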
Restricting access to both the models and the training data can be another way to deal with it. Enabling logging and using data and file versioning will also strengthen machine learning defenses.
C. Think like an Attacker
Developers should think of ways to attack their own model in order to build defenses against such attacks. Doing so highlights the loopholes and remedies even before an actual attack, helps in building mechanisms that discard data points that look like poisoning, and helps in identifying when a model is most vulnerable to attack.
Conclusion
It is critical to safeguard the machine learning models you create, and proactive precautionary measures are far more feasible than curing data poisoning after the fact. Share your thoughts with us at contact.us@virtuetechinc.com