This page provides further information on the synthesized medical dataset, the privacy policies, and the bidirectional transformation implementation used in the PERCOM19 paper “POET: Privacy on the Edge with Bidirectional Data Transformations”. POET has been developed by Nianyu Li (Peking University).

Process Overview

The figure below provides a bird’s-eye view of our approach to privacy on the edge. At design time, two tasks are performed: i) privacy requirements are encoded in privacy policies, using the established privacy formalism of P-RBAC, and ii) source data are interfaced so that they are compatible with our privacy framework. The output artifacts of those two design-time tasks constitute the input to a transformation engine.

Privacy on the Edge

Privacy-Aware Bidirectional Transformations

In this paper, we adopt the putback-based bidirectional programming language BiGUL, which is implemented as an embedded domain-specific language in Haskell. The implementation of the bidirectional transformations is available below. Running it requires a working Haskell environment. BiGUL has been released on Hackage, and the latest version can be installed with Cabal in the usual way by executing the following on the command line: cabal install BiGUL. The following archives contain the POET runtimes.

We consider two data formats: tabular and JSON. The prototypical data preparation stages for the two formats, the BX engine, and the Policy Governor are implemented in the above archives. First, the following commands generate the views introduced in the paper. The source in the experiments of our paper can be any synthesized data in a computational node. The file result.txt records the source and the view; the generated view is stored and can then be used to update another source in a connected node, according to the deployment.

//load the file dealing with BX
  >ghci EdgeBX.hs
  //generate the view from the source defined in the procedure
  //update the source with the changed view, both defined in the procedure
  //measure the view generation time required for the BX engine
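
To illustrate the two directions mentioned above (generating a view from a source, and updating a source with a changed view), the following is a minimal sketch in plain Haskell. It does not use the BiGUL API and is not taken from EdgeBX.hs; the Diagnosis record, its fields, and the functions getView and putView are hypothetical names chosen for illustration. The view hides a private note field, and putting an edited view back restores the hidden field from the old source by patient id.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical source record; the field names are assumptions,
-- not the schema used in the POET archives.
data Diagnosis = Diagnosis
  { patientId :: Int     -- key used to align view entries with source records
  , diagCode  :: String  -- shared with the view
  , note      :: String  -- private: never exposed in the view
  } deriving (Eq, Show)

-- The view keeps only the key and the diagnosis code.
type DiagnosisView = [(Int, String)]

-- Forward direction: generate the view from the source.
getView :: [Diagnosis] -> DiagnosisView
getView src = [ (patientId d, diagCode d) | d <- src ]

-- Backward direction: update the source with a (possibly changed) view.
-- Entries are matched by patient id; hidden notes are copied from the old
-- source, new entries get an empty note, and source records that are
-- absent from the view are dropped.
putView :: [Diagnosis] -> DiagnosisView -> [Diagnosis]
putView src view =
  [ Diagnosis pid c (fromMaybe "" (lookup pid notes)) | (pid, c) <- view ]
  where
    notes = [ (patientId d, note d) | d <- src ]

-- Small example source with made-up values.
exampleSource :: [Diagnosis]
exampleSource =
  [ Diagnosis 1 "Q21" "follow-up in May"
  , Diagnosis 2 "J45" "allergic to penicillin"
  ]
```

In GHCi, putView exampleSource (getView exampleSource) evaluates to exampleSource, and getView (putView exampleSource v) evaluates to v for any view v. The first property relies on patient ids being unique in the source. These are the round-trip properties usually expected of a well-behaved bidirectional transformation.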

Deployment at runtime

Medical Information Privacy Case Study

The importance of patient privacy is emphasized by regulations such as the HIPAA Privacy Rule. Therefore, we adopt virtual repositories that closely resemble real hospital databases. The files below each represent a different dimension of information: Patient (i.e., patient personal information), Admissions, Diagnoses, and Labs (i.e., records of medical tests in a clinical laboratory). The dataset can be found in references [24,25] of the paper.

Synthesized medical dataset

We use synthesized data placed in computational nodes. In the case study, a Doctor’s Office is typically associated with patient records, diagnoses, and medical tests from the labs; a Hospital contains data for patients, past admissions, and diagnoses. The synthesized medical datasets come in two forms: JSON and tabular format.

Doctor’s Office (DO): a practitioner’s office makes use of available data for a patient, such as medical tests from a lab for diagnostic reasons, and keeps patients’ personal information. A doctor may issue diagnoses, and all this information may be synchronized with another doctor or hospital in case of a referral or joint treatment. In our case study we assume a DO to be associated with 0.3k patient records, including their medical tests and diagnoses. Data DO

Medical Lab (ML): a clinical laboratory carries out diagnostic tests on visiting patients. In our case study we assume 2k records of medical tests to be in an ML node. Data ML

Hospital (HL): database facilities in a hospital contain all data for 1k patients, including their personal information from doctor’s offices, as well as 5k past admissions and 5k diagnoses. Such data are often used for patient management and hospital organization. Moreover, hospitals in a region may synchronize data with each other due to patient mobility and specialized care. Data HL

University (UV): research in a university setting often makes use of certain medical data sourced from hospital databases (such as diagnoses of a disease) for scientific reasons. We consider 5k such diagnoses to be associated with a UV. Data UV

Patient’s residence (PR): monitoring of vital medical information may be performed at a patient’s home, making use of recent IoT developments such as Body Area Networks [22]. Medical data produced by small sensors are often used by medical practitioners for live monitoring and diagnosis. We assume 100 records of medical tests from sensors to be associated with a PR node. Data PR

Privacy Policies

Our instantiation of P-RBAC for the edge has the following constituents. A permission specifies what action can be performed on which data object. P-RBAC extends this notion of a permission by adding privacy-related attributes to it, such as purpose, condition and obligations. The purpose binds a permission to a range of duties; for example, sharing purposes may entail different permissions than storage. A condition specifies under which circumstances a permission can be granted. Obligations denote a set of operations which need to be performed whenever a permission has been granted, such as adding a log entry.

A Privacy Governor is responsible for implementing privacy policies, which are written as (r, ((a, d), p, c, ob)): a role r, an action a on a data object d, a purpose p, a condition c, and an obligation ob. The internal representations of the policies considered in the medical case study are listed below.

Pol (Rol "DO") (DP ("Write","diagnosis")) (Pur "MedicalCare") (Con (("",""),"")) (ONull)
Pol (Rol "ML") (DP ("Write","labs")) (Pur "MedicalCare") (Con (("LabDateTime",""),"NEqual")) (ONull)
Pol (Rol "HL") (DP ("Read","patients")) (Pur "Statistics") (Con (("",""),"")) (ONull)
Pol (Rol "UV") (DP ("Read","diagnosis")) (Pur "Research") (Con (("DiagnosisCore","Q21"),"Equal")) (ONull)
Pol (Rol "PR") (DP ("Write","labs")) (Pur "Storage") (Con (("",""),"")) (Obl "notify")
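
The listing above reads as Haskell constructor applications. A minimal sketch of type declarations under which these five policies type-check is given below. The constructor names are taken from the listing, but the type names (Role, DataPermission, and so on), the reading of a condition as ((attribute, value), operator), and the helper functions permitted and holds are assumptions made for illustration; they may differ from the POET sources.

```haskell
-- Assumed type declarations; only the constructor names come from the listing.
newtype Role           = Rol String             deriving (Eq, Show)
newtype DataPermission = DP (String, String)    deriving (Eq, Show) -- (action, data object)
newtype Purpose        = Pur String             deriving (Eq, Show)
newtype Condition      = Con ((String, String), String)
                                                deriving (Eq, Show) -- ((attribute, value), operator)
data Obligation        = ONull | Obl String     deriving (Eq, Show)

data Policy = Pol Role DataPermission Purpose Condition Obligation
  deriving (Eq, Show)

-- The five case-study policies, copied from the listing.
policies :: [Policy]
policies =
  [ Pol (Rol "DO") (DP ("Write","diagnosis")) (Pur "MedicalCare") (Con (("",""),"")) (ONull)
  , Pol (Rol "ML") (DP ("Write","labs")) (Pur "MedicalCare") (Con (("LabDateTime",""),"NEqual")) (ONull)
  , Pol (Rol "HL") (DP ("Read","patients")) (Pur "Statistics") (Con (("",""),"")) (ONull)
  , Pol (Rol "UV") (DP ("Read","diagnosis")) (Pur "Research") (Con (("DiagnosisCore","Q21"),"Equal")) (ONull)
  , Pol (Rol "PR") (DP ("Write","labs")) (Pur "Storage") (Con (("",""),"")) (Obl "notify")
  ]

-- Hypothetical helper: policies that grant a role an action on a data object.
permitted :: String -> String -> String -> [Policy]
permitted role action object =
  [ p | p@(Pol (Rol r) (DP (a, d)) _ _ _) <- policies
      , r == role, a == action, d == object ]

-- Hypothetical helper: evaluate a condition against a record given as
-- attribute/value pairs. The operator semantics are an assumption:
-- an empty operator means "no condition", and an unknown operator
-- or a missing attribute fails the check.
holds :: Condition -> [(String, String)] -> Bool
holds (Con ((attr, val), op)) record =
  case op of
    ""       -> True
    "Equal"  -> lookup attr record == Just val
    "NEqual" -> maybe False (/= val) (lookup attr record)
    _        -> False
```

Under these assumptions, permitted "UV" "Read" "diagnosis" selects the single research policy, and its condition holds only for records whose DiagnosisCore attribute equals "Q21".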