Machine Learning with Encrypted Data | Homomorphic Encryption
- Published 29 Jun 2024
- The world is changing and privacy is becoming a huge concern. The area of machine learning on encrypted data is booming and expected to grow significantly over the next 5 years. If you want to stay ahead of the curve then get educated. Here is a video to get you started.
In this video we will build a machine learning model and apply it to encrypted data using homomorphic encryption with the Paillier cryptosystem.
Code can be found on my github: github.com/satssehgal/Homomor...
👉 Patreon: patreon.com/SATSifaction
👉 Facebook Group: / theaiwarriors
👉 Instagram: @theaiwarriors
👉 Corporate Training and Up skilling: levers.ai
Netfirms (Affiliate) - bit.ly/2KdJ4Dp
Linode Server - bit.ly/2XpqGi9
Bluehost (Affiliate) - bit.ly/2GxxBh1
PythonAnywhere (Affiliate) - bit.ly/2kWORVe
Heroku - www.heroku.co
NordVPN (Affiliate) - bit.ly/2W87je0
✅ Here is a link to my Python for beginners, master Python course: bit.ly/2HIZS42 - Howto & Style
Amazing video, easy to try and understand even for 1-day Python programmers with their own data.
Thank you
This is interesting; I will definitely delve into it
A perfect start for this domain
Thanks for this vid, good to see a 'real' example. The nice thing about a linear model is that you can get a result with multiplication alone. Does the same apply to categorical regression?
Great work, sir. Is there any possibility I could have the code and the dataset so I can try to implement it on my laptop? Thanks in advance.
Hi, is there any other video about this topic, especially a practical one? Do you know of any company that uses the process shown in this video?
Have you used the Pyfhel library anywhere?
Nice video, sir. I tried running the code on my laptop. The model is created and the coefficients are generated. In cust.py both the public and private keys are generated, and the cust.json file is created. But when I open "data.json" it is blank. In the video you show the encrypted form of the data, but mine is empty.
I am getting the error "TypeError: Object of type function is not JSON serializable". Please help me out in this regard, sir.
Can you use encrypted data from party A to train a ML model such that the ML model can be used for inferencing for party B?
If Party A and Party B have a similar data set and use case, sure. For example, if you wanted to use encrypted data from 10 organizations to predict turnover for an 11th company, it can be done as long as the features are similar. If not, the data for your prediction doesn't align with your training data, in which case it's not recommended.
It would be great if you could do an example with functional encryption as well.
What do you mean?
@@SATSifaction I think FE hides the function being applied to the data... I think...
Hi, will this work for BERT-based models?
I'm still having trouble wrapping my head around this. Say my ML model takes in IoT readings (e.g. vibration data) and uses them for anomaly detection. If I send encrypted data to the ML model, I'm pretty sure it will break the model, since the input data is significantly different from the training data. In that case, does that mean the ML model would have to be trained on encrypted data?
same thoughts
No. I think somewhere you misunderstood what is happening. The weights of the model are changed in accordance with the public key which is sent with the data. Pls watch the video again carefully ☺️
@@DamanArora1209 What I can see is we are not changing the model coefficients while multiplying them with encrypted data
@@udayhanmante2402 That's how this type of PARTIAL homomorphic encryption works, remember D ( E(A) * scalar ) = A * scalar
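To make the identity in this thread concrete, here is a minimal toy Paillier in plain Python. It is a sketch for illustration only, not the library used in the video: the small primes, the helper names (`encrypt`, `decrypt`, `mul_scalar`, `add`) and the integer coefficients and features are all assumptions, and the key is far too small to be secure. It shows that a server can multiply encrypted features by unencrypted coefficients and add the results without ever holding the private key.

```python
import math
import secrets

# Toy key built from two known Mersenne primes; far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)


def _l(x):
    """Paillier's L function: L(x) = (x - 1) / N."""
    return (x - 1) // N


# With generator g = N + 1; pow(x, -1, N) needs Python 3.8+.
MU = pow(_l(pow(N + 1, LAMBDA, N_SQ)), -1, N)


def encrypt(m):
    """E(m) = (1 + N)^m * r^N mod N^2, with a fresh random r per call."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) * pow(r, N, N_SQ) % N_SQ


def decrypt(c):
    """Requires the private values LAMBDA and MU."""
    return _l(pow(c, LAMBDA, N_SQ)) * MU % N


def mul_scalar(c, k):
    """E(m) -> E(k * m): raise the ciphertext to the plaintext power k."""
    return pow(c, k, N_SQ)


def add(c1, c2):
    """E(a), E(b) -> E(a + b): multiply the two ciphertexts."""
    return c1 * c2 % N_SQ


# Client side: encrypt the (non-negative integer) features.
features = [35, 2, 1]
enc_features = [encrypt(x) for x in features]

# Server side: plaintext coefficients, encrypted inputs, public key only.
coefs = [3, 10, 7]
enc_prediction = encrypt(0)
for c, k in zip(enc_features, coefs):
    enc_prediction = add(enc_prediction, mul_scalar(c, k))

# Client side again: only the private-key holder can read the result.
prediction = decrypt(enc_prediction)         # 3*35 + 10*2 + 7*1
scaled = decrypt(mul_scalar(encrypt(7), 6))  # D(E(A) * scalar) = A * scalar
print(prediction, scaled)
```

The model coefficients stay in plaintext throughout; only the inputs and the output are encrypted. This toy handles non-negative integers only, whereas a production library would also need an encoding for floats and negative numbers.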
If the user sends the same payload over and over, will the encrypted version of it "rotate", or will it always look the same?
In the second case, one could easily guess the age being sent using statistical analysis.
If the second is true, is that also an issue in Full HME?
If the second is true, would it be "enough" for the user to create another key pair and start encrypting the data with the new pair, so that it looks different?
I also assume that the server cannot use the client's public key to encrypt data, right? Otherwise a brute-force approach would also do the job when trying to guess the distribution of the incoming data.
By the way, good video. I am just new to the matter; that's why I have so many questions.
BTW the py-seal project looks archived on github
Most libraries should 'salt' the encrypted value, so that same clearText in != same cipherText
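In Paillier the "salt" is a random factor r mixed into every ciphertext. The sketch below uses a toy plain-Python Paillier (the small primes, function names and the `r` parameter are assumptions for illustration, not any library's API) to show that encrypting the same value twice gives different ciphertexts that still decrypt to the same plaintext. This is also why re-encrypting a guessed value under the public key is not expected to reproduce an observed ciphertext.

```python
import math
import secrets

# Toy key from two known Mersenne primes; far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAMBDA, -1, N)  # valid for generator g = N + 1; Python 3.8+


def encrypt(m, r=None):
    """E(m) = (1 + N)^m * r^N mod N^2; r is the per-message random factor."""
    while r is None or math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return (1 + m * N) * pow(r, N, N_SQ) % N_SQ


def decrypt(c):
    return (pow(c, LAMBDA, N_SQ) - 1) // N * MU % N


age = 42
first = encrypt(age)   # fresh random r
second = encrypt(age)  # another fresh random r

# Forcing the same r (hypothetical, for demonstration only) removes the
# randomness and makes the two ciphertexts identical.
same_r = encrypt(age, r=12345) == encrypt(age, r=12345)

print(first != second, decrypt(first) == decrypt(second), same_r)
```

With fresh randomness on every call, the same key pair can be reused without the ciphertexts repeating, so generating a new key pair just to change how the data looks should not be necessary.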
You are building the model on the original data and then encrypting the model. Is it possible to build a model on encrypted data?
Great question. To my understanding it's not possible just yet, but maybe one day. First, I would question the practicality of it, because when you build ML models you need a basic understanding of your data. If it's in ciphertext you may not even know what you're working with, so your result will be compromised. Second, looking at it theoretically, there are a lot of complexities, especially if you're dealing with a classification model. In either scenario your feature set needs to be in ciphertext, all of which needs to be converted into some form of numerical representation for any ML model to work. Given this encryption is one-way, getting any meaningful representation would be difficult and would probably require excess compute power to try to decrypt it into a format that makes sense to an ML model, though in my opinion you would still have higher loss than if it weren't encrypted. Unless you have a lot of infrastructure resources, it may be premature to delve into this space.
Yes, it is!
Re: data distribution, yes, that is a potential problem but it kind of depends on the problem.
Hey, this is my compute data function and the error is: 'EncryptedNumber' object is not subscriptable.
def computeData():
    pk = data['public_key']
    pubkey = paillier.PaillierPublicKey(n=int(pk['n']))
    for x in data['values']:
        enc_nums_rec = paillier.EncryptedNumber(pubkey, int(x[0], int(x[1]))) # extracting public key from the data
    for i in range(len(list(mycoef))):
        results = sum(mycoef[i]*enc_nums_rec[i])
    return results, pubkey
If someone can help then it will be appreciated.
Is it possible to train the model on encrypted data?
Yes, that's what this video displays, albeit on limited data and only regression. Full HME is still being worked on and will have more applications in the future.
@@SATSifaction Hello. The video shows that the regression model is trained on unencrypted data. myCoef = LinMdel.getCoef() uses normal data. Thanks
Git Missing :/
Can't you help me, please?