Amazing video, easy to try and understand even for 1-day Python programmers with their own data.
Thank you
A perfect start for this domain
This is interesting, I will definitely delve into it
Have you applied the Pyfhel library anywhere?
Thanks for this vid, good to see a 'real' example. The nice thing about a linear model is that you can get a result with multiplication alone. Does the same apply to categorical regression?
Hi, is there any other video about this topic, especially a practical one? Do you know which company used the process shown in this video?
It would be great if you could do an example with functional encryption as well.
What do you mean?
@SATSifaction I think FE hides the function being applied to the data... I think...
Hi, will this work for BERT-based models?
Great work, sir. Is there any possibility for me to have the code and the dataset so I can try to implement it on my laptop? Thanks in advance.
Can you use encrypted data from party A to train an ML model such that the model can be used for inference for party B?
If Party A and B have a similar data set and use case, sure. For example, if you wanted to use encrypted data from 10 organizations to predict turnover for an 11th company, it can be done as long as the features are similar. If not, the data you feed your prediction model won’t align with the data it was trained on, in which case it’s not recommended.
If the user sends the same payload over and over again, will the encrypted version of it "rotate" or will it always look the same?
In the second case one could easily guess the age being sent using statistical analysis.
If the second is true, is that also an issue in Full HME?
If the second is true, would it be "enough" for the user to create another key pair and start encrypting the data with the new pair, so it looks different?
I also assume that the server cannot use the client's public key to encrypt data, right? Otherwise a brute-force approach would also do the job when trying to guess the distribution of the incoming data.
Btw good video, I am just new to the matter, that's why so many questions.
BTW the py-seal project looks archived on GitHub
Most libraries should 'salt' (randomize) the encrypted value, so that the same cleartext in != the same ciphertext out
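To make the 'salting' concrete, here is a minimal sketch of textbook Paillier in plain Python. It is a toy: the primes are tiny, the names (`encrypt`, `decrypt`, `P`, `Q`) are made up for this illustration, and it is not the code from the video or from any library. The point is only that the random factor r makes two encryptions of the same age look unrelated.

```python
import math
import secrets

# Toy key: tiny primes so the numbers stay readable. NOT secure.
P, Q = 1789, 1861
N = P * Q                                           # public modulus
N_SQ = N * N
G = N + 1                                           # common choice of generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(p-1, q-1), private
MU = pow(LAM, -1, N)                                # private; valid because g = n + 1

def encrypt(m, r=None):
    """Encrypt an integer 0 <= m < N; a fresh random r is drawn unless one is given."""
    if r is None:
        r = secrets.randbelow(N - 1) + 1
        while math.gcd(r, N) != 1:
            r = secrets.randbelow(N - 1) + 1
    return pow(G, m, N_SQ) * pow(r, N, N_SQ) % N_SQ

def decrypt(c):
    """Recover m using the private values LAM and MU."""
    return (pow(c, LAM, N_SQ) - 1) // N * MU % N

age = 42
c1 = encrypt(age, r=2)   # r is fixed here only to make the demo reproducible
c2 = encrypt(age, r=3)
print(c1 != c2)                    # same plaintext, different ciphertexts
print(decrypt(c1), decrypt(c2))    # both decrypt back to the same age
```

This also bears on the public-key question above: anyone holding the public key can encrypt a guess, but without knowing the r that the client used, the guess will not match the ciphertext that was sent, so comparing ciphertexts does not reveal the value. A new key pair per message should not be needed for that.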
I'm still having trouble wrapping my head around this. Say my ML model takes in IoT readings (e.g. vibration data) and uses them for anomaly detection. If I send encrypted data to the ML model, I'm pretty sure it'll break the model, since the input data is for sure significantly different from the training data. In this case, does that mean the ML model would have to be trained on encrypted data?
No. I think somewhere you misunderstood what is happening. The weights of the model are changed in accordance with the public key which is sent with the data. Please watch the video again carefully ☺️
@DamanArora1209 What I can see is that we are not changing the model coefficients when multiplying them with the encrypted data
@udayhanmante2402 That's how this type of PARTIAL homomorphic encryption works. Remember: D(E(A) * scalar) = A * scalar
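Spelled out for textbook Paillier (an assumption; the `paillier` calls elsewhere in this thread suggest that is the scheme in use), with public key (n, g), random r, integer plaintexts x_i and integer weights w_i, the identity looks roughly like this. Note that the library's `*` between a scalar and a ciphertext corresponds to raising the ciphertext to a power underneath.

```latex
\begin{align*}
E(m; r) &= g^{m}\, r^{n} \bmod n^{2} \\
E(m; r)^{k} &= g^{km}\, (r^{k})^{n} \bmod n^{2} = E(km;\, r^{k}) \\
E(m_1; r_1)\, E(m_2; r_2) &= E(m_1 + m_2;\, r_1 r_2) \\
D\Big(\prod_i E(x_i)^{w_i}\Big) &= \sum_i w_i x_i \bmod n
\end{align*}
```

So the server keeps its coefficients in plaintext and never touches the private key; only the inputs and the result are encrypted. Real-valued weights need a fixed-point encoding on top of this.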
Nice video, sir. I tried executing the code on my laptop. The model is created and the coefficients are generated. In cust.py both the public and private keys are generated, and the cust.json file is created. But the "data.json" file is blank when I open it. The video shows the encrypted form of the data, but I am getting an empty file.
I am getting the error "TypeError: Object of type function is not JSON serializable". Please help me out with this, sir.
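One way to get exactly this message is to hand `json.dump` a function object instead of the value the function returns (a missing pair of parentheses). This is only a guess, since the commenter's code isn't shown; `serialize_data` and its payload below are hypothetical stand-ins. It would also explain the blank file: opening data.json in write mode empties it, and the exception fires before anything is written.

```python
import json
import os
import tempfile

def serialize_data():
    # Hypothetical stand-in for the function that builds the encrypted payload.
    return {"public_key": {"n": "12345"}, "values": [["6789", 0]]}

path = os.path.join(tempfile.mkdtemp(), "data.json")

# Buggy: the function itself is passed, not its return value.
try:
    with open(path, "w") as f:          # opening with "w" already empties the file
        json.dump(serialize_data, f)
except TypeError as exc:
    error_message = str(exc)

size_after_bug = os.path.getsize(path)  # the file exists but nothing was written

# Fixed: call the function so json.dump receives a dict.
with open(path, "w") as f:
    json.dump(serialize_data(), f)

with open(path) as f:
    restored = json.load(f)

print(error_message)
print(size_after_bug)
print(restored)
```

If the traceback points at the `json.dump` line in cust.py, checking whether the first argument is missing its `()` would be the first thing to try.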
Hey, this is my compute data function and the error is: 'EncryptedNumber' object is not subscriptable.
def computeData():
    pk = data['public_key']
    pubkey = paillier.PaillierPublicKey(n=int(pk['n']))
    for x in data['values']:
        enc_nums_rec = paillier.EncryptedNumber(pubkey, int(x[0], int(x[1])))  # extracting public key from the data
    for i in range(len(list(mycoef))):
        results = sum(mycoef[i]*enc_nums_rec[i])
    return results, pubkey
If someone can help, it would be appreciated.
You are building the model on the original data and then encrypting the model. Is it possible to build a model on encrypted data?
Great question. I'm not sure it's possible just yet, to my understanding, but maybe one day. First, I would question the practicality of it, because when you build ML models you need a basic understanding of your data. If it's in ciphertext, you may not even know what you're working with tbh, so your result will be compromised. Second, if we look at it theoretically, there are a lot of complexities, especially if you're dealing with a classification model. In either scenario your feature set needs to be in ciphertext, and all of it needs to be converted into some form of numerical representation for any ML model to work. Given that the ciphertext cannot be read without the private key, getting any meaningful representation would be difficult and would probably require excess compute power to try to turn it into a format that may make sense to an ML model, though you would still have higher loss, in my opinion, than if it wasn't encrypted. Unless you have a lot of infrastructure resources, it may be premature to delve into this space.
Yes, it is!
Re: data distribution, yes, that is a potential problem but it kind of depends on the problem.
Is it possible to train the model on encrypted data?
Yes, that’s what this video displays, albeit on limited data and only regression. Full HME is still being worked on and will have more applications in the future.
@SATSifaction Hello. The video shows that the regression model is trained on unencrypted data. myCoef = LinMdel.getCoef() uses normal data. Thanks
Git Missing :/
Can't you help me, please?