Sanil, Ashish (2003), Pointers from Research on Data Confidentiality and Data Quality, Computing Science and Statistics, 35, I2003Proceedings/SanilAshish/SanilAshish.presentation.pdf
Issues of Data Confidentiality (DC) - making information available (possibly by releasing suitably modified or restricted data sets) while protecting the confidentiality of data subjects - and Data Quality (DQ) have been enduring concerns of Federal statistical agencies and other agencies that regularly disseminate data to the public. Consequently, there exists a body of established techniques for addressing DC and DQ problems. Both problems have dual aspects that are relevant to homeland security issues. In the case of DC, in order to apply disclosure limitation methods to the data, one has to understand how a possibly malicious user could identify individuals in the released data. A significant portion of DQ efforts involves the detection of anomalous records in the data. Both families of techniques can be useful for unmasking individuals in databases with noisy data and for identifying aberrant records in the database. Lessons gleaned from NISS projects on DC and DQ will be presented as illustrations.