Logistic regression over encrypted data from fully homomorphic encryption

Abstract. Background: One of the tasks in the 2017 iDASH secure genome analysis competition was to enable training of logistic regression models over encrypted genomic data. More precisely, given a list of approximately 1500 patient records, each with 18 binary features containing information on speci...

Bibliographic Details
Main Authors: Hao Chen (Author), Ran Gilad-Bachrach (Author), Kyoohyung Han (Author), Zhicong Huang (Author), Amir Jalali (Author), Kim Laine (Author), Kristin Lauter (Author)
Format: Book
Published: BMC, 2018-10-01.

MARC

LEADER 00000 am a22000003u 4500
001 doaj_cd2455da1f2944fbba17d2fb9d7c1c20
042 |a dc 
100 1 0 |a Hao Chen  |e author 
700 1 0 |a Ran Gilad-Bachrach  |e author 
700 1 0 |a Kyoohyung Han  |e author 
700 1 0 |a Zhicong Huang  |e author 
700 1 0 |a Amir Jalali  |e author 
700 1 0 |a Kim Laine  |e author 
700 1 0 |a Kristin Lauter  |e author 
245 0 0 |a Logistic regression over encrypted data from fully homomorphic encryption 
260 |b BMC,   |c 2018-10-01T00:00:00Z. 
500 |a 10.1186/s12920-018-0397-z 
500 |a 1755-8794 
520 |a Abstract. Background: One of the tasks in the 2017 iDASH secure genome analysis competition was to enable training of logistic regression models over encrypted genomic data. More precisely, given a list of approximately 1500 patient records, each with 18 binary features containing information on specific mutations, the idea was for the data holder to encrypt the records using homomorphic encryption and send them to an untrusted cloud for storage. The cloud could then homomorphically apply a training algorithm to the encrypted data to obtain an encrypted logistic regression model, which could be sent back to the data holder for decryption. In this way, the data holder could outsource the training process without revealing either her sensitive data or the trained model to the cloud. Methods: Our solution to this problem has several novelties: we use a multi-bit plaintext space in fully homomorphic encryption together with fixed point number encoding; we combine bootstrapping in fully homomorphic encryption with a scaling operation in fixed point arithmetic; we use a minimax polynomial approximation to the sigmoid function and the 1-bit gradient descent method to reduce the plaintext growth in the training process. Results: Our algorithm for training over encrypted data takes 0.4-3.2 hours per iteration of gradient descent. Conclusions: We demonstrate the feasibility but high computational cost of training over encrypted data. On the other hand, our method can guarantee the highest level of data privacy in critical applications. 
546 |a EN 
690 |a Cryptography 
690 |a Homomorphic encryption 
690 |a Logistic regression 
690 |a Internal medicine 
690 |a RC31-1245 
690 |a Genetics 
690 |a QH426-470 
655 7 |a article  |2 local 
786 0 |n BMC Medical Genomics, Vol 11, Iss S4, Pp 3-12 (2018) 
787 0 |n http://link.springer.com/article/10.1186/s12920-018-0397-z 
787 0 |n https://doaj.org/toc/1755-8794 
856 4 1 |u https://doaj.org/article/cd2455da1f2944fbba17d2fb9d7c1c20  |z Connect to this object online.
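
Illustrative sketch (not part of the catalog record). The Methods summary in the abstract mentions replacing the sigmoid with a polynomial and using 1-bit gradient descent. The plaintext Python below is a minimal sketch of those two ideas only; it performs no encryption and is not the authors' implementation. Two assumptions are made: the degree-3 Taylor polynomial is used as a stand-in for the paper's minimax polynomial (whose coefficients are not given in this record), and "1-bit gradient descent" is read as updating each weight using only the sign of its gradient component. All function names and the toy data are hypothetical.

```python
def poly_sigmoid(z):
    # Degree-3 Taylor expansion of 1 / (1 + exp(-z)) around z = 0.
    # Assumption: a stand-in for the minimax polynomial named in the abstract.
    return 0.5 + z / 4.0 - z ** 3 / 48.0


def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_sign_gd(rows, labels, steps=10, lr=0.1):
    # Assumption: "1-bit" is interpreted as keeping only the sign of each
    # gradient component, so every weight moves by exactly +/- lr (or 0).
    w = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = poly_sigmoid(dot(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wj - lr * sign(gj) for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    return 1 if poly_sigmoid(dot(w, x)) > 0.5 else 0


# Hypothetical toy data: a constant bias column plus one binary feature,
# with the label equal to the feature.
ROWS = [[1.0, 1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
LABELS = [1, 0, 1, 0]

if __name__ == "__main__":
    weights = train_sign_gd(ROWS, LABELS)
    print(weights, [predict(weights, x) for x in ROWS])
```

The polynomial matters because homomorphic schemes evaluate additions and multiplications, so the exponential in the sigmoid has to be approximated. A sign-only update keeps the size of each step fixed, which is one plausible reading of how the method limits plaintext growth; the paper itself should be consulted for the actual update rule and parameters.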