PORTIA Workshop
on Sensitive Data in Medical,
Financial, and Content-Distribution
Systems
Abstracts
Alessandro Acquisti (Carnegie Mellon University)
Privacy, Rationality, and the
Economics of Immediate Gratification
Dichotomies between privacy attitudes and behavior have been noted in
the literature but not yet fully explained. We apply lessons from the
research on behavioral economics to understand the individual decision
making process with respect to privacy in electronic commerce. We show
that it is unrealistic to expect individual rationality in this
context. Models of incomplete information, bounded rationality, and
immediate gratification offer more realistic descriptions of the
decision process and are more consistent with currently available data.
In particular, we present a model that shows why individuals who may
genuinely want to protect their privacy might not do so because of
psychological distortions well documented in the behavioral literature.
The model shows that these distortions may affect not only 'naive'
individuals but also 'sophisticated' ones, and that this may occur even
when individuals perceive the risks from not protecting their privacy
as significant. Finally, we present preliminary evidence from an
ongoing series of surveys and experiments aimed at testing the model's
predictions.
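The abstract leaves the model unspecified, but the behavioral-economics
literature it draws on standardly formalizes immediate gratification as
quasi-hyperbolic (beta-delta) discounting; a minimal sketch of that
utility form, offered here purely as illustration:

    % Quasi-hyperbolic (beta-delta) discounting: delta is the conventional
    % exponential discount factor, while beta < 1 adds present bias.
    U_t = u_t + \beta \sum_{k=1}^{\infty} \delta^{k}\, u_{t+k},
    \qquad 0 < \beta \le 1, \quad 0 < \delta < 1

With beta < 1, the immediate cost of a protective measure is felt in
full while all future benefits are scaled down by beta, so even an
individual who genuinely values privacy, naive or sophisticated, may
keep postponing protection.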
Gagan Aggarwal, Tomas Feder, Krishnaram Kenthapadi, Rajeev Motwani, Rina Panigrahy, Dilys Thomas, and An Zhu (Stanford University)
Anonymizing Tables for
Privacy Protection
Download PDF here.
Carol Coye Benson (Glenbrook Partners)
Preventing Identity Theft: Consumer
Credit Files, Banks and Privacy
The current credit-granting infrastructure is one of the great drivers
of the American economic engine, and the envy of many other countries.
This same infrastructure, unfortunately, is also the unwitting enabler
of identity theft. Stopping this crime will require "locking down" the
infrastructure and giving consumers more control over their credit
files. The challenge is in authenticating the consumer. Banks may play
a critical role in resolving this issue.
Joan Feigenbaum (Yale University)
Are "Trusted Systems"
Important for Privacy Protection?
Trusted-platform initiatives such as Microsoft's Next-Generation
Secure-Computing Base and the industry-wide Trusted Computing Group
project are the subject of significant research and development now. The
goal of these initiatives is to change a fundamental fact about
networked, general-purpose computers that is often viewed as a barrier
to security: Once data are sent from one machine to another, the sender
loses control over them. Trusted-platform designs offer hardware-based,
cryptographic support for proofs that a potential receiver's machine is
running an approved software stack. By making such proofs prerequisites
for the transfer of sensitive data, owners of these data can ensure
that only authorized applications will be run and only authorized
actions will be taken by users.
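As a rough sketch of that gating step (illustrative only, not any
actual NGSCB or TCG interface; an HMAC stands in for the hardware
signature a real trusted platform would produce, and all names and
values are invented):

    # Sender releases sensitive data only if the receiver proves, via an
    # attestation 'quote' over its platform measurement and a fresh nonce,
    # that it runs an approved software stack.
    import hashlib
    import hmac

    APPROVED_MEASUREMENT = hashlib.sha256(b"approved-os+approved-app").digest()

    def quote(measurement: bytes, nonce: bytes, key: bytes) -> bytes:
        # What the receiver's (hypothetical) trusted hardware would return.
        return hmac.new(key, measurement + nonce, hashlib.sha256).digest()

    def send_if_trusted(data: bytes, measurement: bytes, q: bytes,
                        nonce: bytes, key: bytes) -> bytes:
        expected = quote(APPROVED_MEASUREMENT, nonce, key)
        if measurement == APPROVED_MEASUREMENT and hmac.compare_digest(q, expected):
            return data  # receiver proved an approved software stack
        raise PermissionError("platform not attested; data withheld")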
The best publicized motivation for this type of "remote control" of
networked computers is copyright enforcement for entertainment content,
but many people have claimed that it is much more widely applicable.
In particular, the claim is often made that, in application domains in
which sensitive data abound, such as healthcare and finance, data
protection would be greatly aided by widespread adoption of trusted
systems.
The purpose of this talk is to examine the validity of this claim. Is
circumvention of data-protection systems (the type of attack that might
be thwarted by trusted systems) a significant barrier to secure
information systems in healthcare and finance, or are there more
mundane barriers that are actually more significant, to wit:
- Many sensitive data objects are small (e.g., names,
social-security numbers, or one-bit answers to medical tests) and hence
easy to transmit via low-tech channels such as phone calls;
- Regulatory regimes (e.g., HIPAA or Gramm-Leach-Bliley) are
non-deterministic and, more generally, hard to automate;
- Critical personnel are poorly trained, and enterprises often lack
proper (human) procedures that must complement information systems in
order for end-to-end handling of data to go smoothly;
- Many regulatory regimes *permit*
data flows that data subjects would consider to be "leaks"!
The talk will conclude with an attempt to characterize application
domains in which trusted systems could be most helpful in secure-policy
enforcement.
Robert Grimm (New York University)
Security Challenges for Rich-Media
Educational Environments
Medicine faces a major and growing chasm between scientific knowledge
and medical practice. On one side, rapid advances in molecular biology
are reshaping medical science. On the other side, managed care has
resulted in drastically reduced lengths-of-stay in hospitals and a
general compartmentalization of medical practice. As a result, it is
becoming increasingly difficult to train physicians who can provide
state-of-the-art medical care, as medical practitioners cannot keep up
with the rapidly changing basic sciences, do not have enough context to
make appropriate diagnoses, and may rely on outdated procedures or drug
regimens.
The premise of the Infrastructure for Rich-Media Educational
Environments (IRMEE) project at New York University is that a
sustainable solution requires the integration of medical knowledge
across specializations, between theory and practice, and across
geographical boundaries and time. The chosen approach is to create a
web-based rich-media environment that (1) provides ubiquitous and
lifelong access to educational and scientific materials, (2) structures
educational content along narrative lines to re-establish missing
context, and (3) fosters a community of students and practitioners not
bound by geography. Experiences at NYU's medical school with a set of
prototypes support the general approach, demonstrating that rich-media
educational environments have advantages over textbooks, educational
videos, and lectures alike.
Unfortunately, the straightforward multi-tier web architecture used
for these prototypes has serious scalability constraints and does not
provide an adequate basis for realizing the larger vision of IRMEE. To
overcome this major deficiency, we are building a more scalable content
delivery infrastructure. The goal is to combine the usability of
familiar web content management systems with the scalability of
peer-to-peer content distribution networks (CDNs) built on distributed
hash tables. We aim to achieve this goal by allowing for the execution
of application-specific services, which are expressed through scripts,
within the content distribution network instead of only on the server.
Our architecture leaves both clients and servers unchanged, thus
letting us track any advances in web functionality. Furthermore, its
scripting-based programming model is already familiar to web
developers, thus significantly reducing the barrier to entry in
developing applications.
While we believe that a scripting-enhanced CDN provides an appropriate
solution for scaling IRMEE, our architecture also raises two important
security challenges. First, since the CDN is implemented as a
peer-to-peer system, content integrity becomes an important issue.
Without additional safeguards, CDN nodes can modify or replace content
with their own, arbitrary versions. Some replaced content may be
obvious (consider spam-like advertisements), but other content may be
considerably less obvious and consequently more dangerous (consider
falsified medical research reports). To make matters worse, established
solutions for ensuring content integrity, such as cryptographic hashes,
are ineffective in our architecture, as scripting-enabled CDN nodes, by
definition, may modify or even create content.
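A toy illustration of this point: a plain hash lets a client reject
content tampered with by an untrusted node, but it equally rejects any
legitimate, script-produced rewrite, which is exactly what a
scripting-enabled node emits.

    # Plain-hash integrity checking, sketched in Python for illustration.
    import hashlib

    def digest(content: bytes) -> str:
        return hashlib.sha256(content).hexdigest()

    original = b"<p>peer-reviewed result</p>"
    published_digest = digest(original)          # distributed with the URL

    assert digest(original) == published_digest                     # accepted
    assert digest(b"<p>falsified result</p>") != published_digest   # attack caught
    personalized = original + b"<p>annotations added by a CDN script</p>"
    assert digest(personalized) != published_digest  # legitimate rewrite also rejected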
Second, some content, such as students' contributions to
discussion groups, may refer to actual patients' case histories.
As medical data must be kept private, access to user-generated content
should be restricted to authorized users. However, as the peer-to-peer
CDN is generally untrusted, authorization can only be performed by the
original servers. The simplest solution is to partition all content
into two categories, public and private. Public content will be
accessible through the CDN, while private content can only be accessed
directly through the corresponding server, which is protected through
SSL and proper authentication. However, this solution also has the
disadvantage of eschewing any scalability advantages of the CDN for
private content. Overall, from a security perspective, the issue is to
provide strong security guarantees for a relatively untrusted network,
while also remaining compatible with the existing web-based
infrastructure.
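A minimal sketch of that partitioning decision (hostnames
hypothetical): private requests are routed to the origin server, which
can authenticate the user over SSL, while public requests go to the
untrusted CDN.

    CDN_BASE = "http://cdn.example.org"        # untrusted peer-to-peer CDN
    ORIGIN_BASE = "https://irmee.example.edu"  # authenticating origin server

    def resolve(path: str, is_private: bool) -> str:
        # Private content bypasses the CDN entirely; public content
        # scales through it.
        base = ORIGIN_BASE if is_private else CDN_BASE
        return base + "/" + path.lstrip("/")

    assert resolve("lectures/cardiology.html", False).startswith("http://cdn.")
    assert resolve("discussions/case-417.html", True).startswith("https://irmee.")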
Rachel Greenstadt and Jean Francois Raymond (Harvard University)
Applications of Trusted Computing for
Medical Privacy
Download PDF here.
Benjamin Grosof (MIT Sloan School of Management)
Rules Knowledge Representation for Privacy Policies:
RuleML, Semantic Web Services, and their Research Frontiers
We give an overview of how the field of rules knowledge representation
(KR) bears on the policy aspect of privacy, including current
techniques, theory, standards, and research frontiers. Our own
previous contributions to rules KR include the Situated Courteous
Logic Programs KR (SCLP), the emerging RuleML standard for Semantic
Web rules, which is based on SCLP and which we co-founded, and their
applications to e-contracting, financial information integration, and
trust policy management in the Semantic Web and Web Services.
Privacy policies can be viewed as a broad special case of trust
authorization policies -- those in which authorization decisions are
made about access to information. Such policies in today's
commercially deployed systems are usually well represented as rules,
but those systems' designs do not yet exploit the last decade's
research about rules KR, notably SCLP and RuleML.
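To make the rules view concrete, here is a toy authorization check in
the spirit of such logic-program rules (this is neither SCLP nor RuleML
syntax; the predicates and facts are invented for illustration):

    # A privacy policy as rules over facts: permit access when the user
    # holds a treating role and the patient consented to the purpose.
    FACTS = {
        ("role", "alice", "treating_physician"),
        ("consents", "patient42", "treatment_use"),
    }

    RULES = [
        lambda user, record, purpose, facts:
            ("role", user, "treating_physician") in facts
            and ("consents", record, purpose) in facts,
    ]

    def permit(user, record, purpose, facts=FACTS):
        return any(rule(user, record, purpose, facts) for rule in RULES)

    assert permit("alice", "patient42", "treatment_use")
    assert not permit("bob", "patient42", "marketing_use")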
This creates a major set of opportunities for privacy research.
First, today's leading Web standards for access control policies
(XACML) and client privacy policies (P3P) are based on rules.
Second, the policy aspect of Web Services, particularly for security,
is now a major focus of industry efforts in Web Services overall.
Third, a small number of vertical industry domains appear
to be suitable as early adopters/investors for this technology direction.
These verticals include financial services (we give examples), health, and
police/military.
Stanislaw Jarecki (University of California, Irvine), Patrick Lincoln (SRI International), and Vitaly Shmatikov (SRI International)
Handcuffing Big Brother: an
abuse-resilient transaction escrow scheme
We propose a new approach for privacy-preserving transaction escrow
that balances citizens' desire for privacy and the need of government
agencies to collect accurate information about financial, commercial
and other transactions, and to quickly identify certain patterns of
activities. Our escrow scheme provides a provable anonymity and privacy
guarantee to transaction participants unless their transactions match a
pre-specified pattern, or are subpoenaed by a court warrant.
Neither selective disclosure nor efficient subpoena can be implemented
using conventional public-key escrow mechanisms. Moreover, traditional
escrow schemes for protecting keys, identities, and data assume that
the escrow agency is trusted not to perform unauthorized searches on
the data, leak the keys to third parties, and so on. They are vulnerable
to the insider threat: a malicious or careless employee can exploit or
disclose citizens' personal data without authorization. By contrast,
our transaction escrow scheme is provably secure against malicious
misbehavior by the escrow agency's employees.
The key innovation underlying our technology is "verifiable transaction
escrow." We propose to equip existing commercial and governmental
databases and other information processing centers with transaction
escrowing capabilities. Transaction participants will encrypt the data
themselves, but correctness of the escrows will be verified, in a
privacy-preserving way, using efficient zero-knowledge protocols. This
will guarantee that the escrow agent can de-anonymize the entries and
remove the encryption *if
and only if* the data match a
certain pattern or one of the transaction participants has been
subpoenaed. For example, a national security agency may collect
encrypted passenger itineraries from commercial airlines and require
automatic disclosure for the records of any passenger who traveled to
the Middle East 5 times or more within a year. In another application,
a financial regulator may require automatic disclosure of all transfers
to a particular group of accounts as soon as the total amount of these
transfers exceeds $10,000 - even if the transfers are performed using
different banks and wire services! The transfers not matching this
pattern will remain completely anonymous and undecipherable even while
stored in government-controlled databases, thus alleviating concerns of
privacy advocates. Until their creator is subpoenaed, it is provably
infeasible even to determine whether two entries refer to the same
individual or not.
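The actual construction relies on verifiable escrow and zero-knowledge
proofs, which do not fit a short sketch; the threshold-disclosure
behavior alone, however, can be illustrated with plain Shamir secret
sharing (parameters hypothetical: one share escrowed per matching
transaction, five shares needed to recover the de-anonymization key):

    import random

    P = 2**127 - 1  # Mersenne prime; shares live in the field GF(P)

    def make_shares(secret, k, n):
        # Degree-(k-1) polynomial with the secret as constant term: any k
        # of the n points reconstruct it; fewer reveal nothing.
        coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
        return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
                for x in range(1, n + 1)]

    def reconstruct(shares):
        # Lagrange interpolation at x = 0 over GF(P).
        secret = 0
        for i, (xi, yi) in enumerate(shares):
            num = den = 1
            for j, (xj, _) in enumerate(shares):
                if i != j:
                    num = num * -xj % P
                    den = den * (xi - xj) % P
            secret = (secret + yi * num * pow(den, P - 2, P)) % P
        return secret

    key = 123456789                        # stands in for the escrowed key
    shares = make_shares(key, k=5, n=12)   # e.g., one share per matching trip
    assert reconstruct(shares[:5]) == key  # fifth matching escrow unlocks the key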
The key features of our transaction escrow scheme are (i) selective
disclosure for transaction records that match certain patterns, (ii)
complete anonymity and privacy for all other records without requiring
the escrow agency to trust the subjects of monitoring or vice versa,
and without involving a trusted intermediary in every transaction, and
(iii) practical efficiency. Our approach also obviates the need for
independent auditing of database access. We provide strong
cryptographic guarantees that it is simply impossible to access the
database in any manner other than that explicitly permitted by the
selective disclosure policy.
The overall objective of our project is to provide a cryptographically
protected balance between citizens' privacy and the need of authorities
to collect certain well-defined information. In health reporting, law
enforcement, anti-terror, secure audit, and other applications, if
honest users were assured of the privacy of their data, there would be
higher levels of compliance and less need for privacy-by-obscurity.
Bret Kiraly, Andy Podgurski, and Sharona Hoffman (Case Western Reserve University)
Security Vulnerabilities and
Conflicts of Interest in the
Provider-Clearinghouse*-Payer Model
Download the PDF.
Rick Luce (Los Alamos National Laboratory)
Seeking a Sustainable Balance for
Governmental Technical
Reports: Public Access vs. Security
What happens in a world where the events of September 11 turn access
to governmental reports from a shining example of public reporting and
accountability into a safety and security liability? When the
definitions of what constitutes sensitive information and who counts as
a legitimate user are dynamic and change literally overnight, what are
the requirements for systems that support and deliver such information?
As early as 1994, Los Alamos National Laboratory led the nation in
making its technical report literature available via the Web to the
global community, only to face new definitions of what is or is not
appropriate for Web access and what constitutes sensitive information.
Building on the LANL experience, this talk will outline some of the
issues that governmental agencies face in publishing reports that on
one hand contain information that legally is required to be widely
disseminated, yet on the other hand may be of use in unforeseeable ways
by malicious actors.
Daniel R. Masys (University of California, San Diego)
Medical Data: It's Only Sensitive if
It Hurts When You Touch It
Effective health care is
built on a foundation of trust between provider and patient.
Trust, in turn, requires that confidentiality of personal information
be maintained, a principle that has been a cornerstone of health care
since the time of Hippocrates. Professional codes of ethics
regarding confidentiality became state regulations in the 20th Century,
and public concern over medical data privacy gave rise to the federal
Health Insurance Portability and Accountability Act (HIPAA) Privacy
Rule, which established uniform requirements for protecting the
confidentiality of medical data effective in 2003. Among the
provisions of the Privacy Rule are new rights of individuals to inspect
and copy their medical records, obtain a record of disclosures of their
data, request amendments to their records, and request restrictions on
how their medical data is used and who can see it. A companion
HIPAA Security Rule will become effective in 2005. The Security
Rule addresses policy and technology requirements for protecting health
data contained in computers and transmitted over networks; these
requirements are similar to best practices used in other industries.
Healthcare provides unique challenges with respect to data
security. Conventional role-based security is difficult to
implement due to the many (often underspecified) roles played by
individuals, institutions and processes, including primary and
specialist providers, support services, billing and payment
arrangements, public health and regulatory agencies. Society
upholds the notion of confidentiality but reserves the right to
pre-empt confidentiality protections in cases of communicable diseases
and other threats to public health. Special protections are
written into law for certain types of medical data, such as mental
health records, substance abuse, adoption, abortion, and HIV
status. Creating technology that recognizes these types of data
in narrative records of care is an unsolved challenge. Perhaps
most importantly, simple models of information security fail for two
reasons: the healthcare system is so complex that individuals
generally cannot comprehend the full effect of their decisions to
withhold or release medical data, and their preferences regarding
confidentiality may change over time in ways they cannot predict.
When an individual most needs to change a prior preference, they may be
unconscious or otherwise cognitively impaired and unable to do
so.
Currently available data suggests that far more harm (in the form of
medical errors) results from lack of accessibility of medical data than
from breaches of confidentiality, and evidence is accumulating that
HIPAA is stifling clinical research, epidemiology, and the advancement
of health science. Practical measures such as the assignment of a
healthcare-specific unique identifier for each person have underpinned
national health systems in other countries, but have been blocked in
this country by privacy rights advocates, contributing to an
inefficient and sometimes life-threatening fragmentation of care. As a
nation, we seem to be incapable of achieving consensus on the
appropriate uses of personal health data.
New models exist for “e-consent” and healthcare-specific role-based
security, but these technologies have yet to be tested and widely
accepted. Whatever the pathway of future innovation, it is clear
that an electronic security infrastructure for medical data will need
to have great flexibility to adapt to context-specific preferences and
policies, and will need to incorporate understanding of the semantic
content as well as the structure of healthcare records.
Nina Mishra (HP Labs/Stanford) and Kobbi Nissim (Microsoft)
How Auditors May Inadvertently
Compromise Your Privacy
Download PDF here.
Prakash Nadkarni, Rohit Gadagkar, Charles Lu, Aniruddha Deshpande, Kexin Sun, and Cynthia Brandt (Yale University)
Security in the context of a Generic
Clinical Study Data Management System
TrialDB is a generic clinical study data management system (CSDMS) that
is used at Yale by several departments, as well as by several centers
nationally. In a system that is intended to facilitate the logistics of
prospective clinical trials, authorities such as Daniel Masys have
pointed out that there is a conflict between the need for maintaining
patient anonymity and the support of automation for tasks such as
patient appointment and follow-up: if a policy is made not to store any
form of personal health information in the database, such automation
becomes impossible. Also, in situations such as cancer chemotherapy,
where the decision as to whether or not to escalate the dose of highly
toxic drugs is made based on patient response and occurrence of adverse
effects, the immediate consequences of accidentally escalating dosage
for the wrong patient - namely, death or disability - far outweigh the
risks of identity disclosure.
Our first step towards implementing secure practices was to have a
policy in place that specifies the appropriate level of confidentiality
for a given type of clinical study (e.g., a retrospective study vs. a
prospective one, or a survey vs. one that involves major therapeutic
interventions which are themselves associated with significant risk).
Second, patient-related data security is not the only kind that must be
considered: one must define what the various types of participants in a
particular study - investigators, administrators, data entry personnel
- need to know in order to function and what they do not. The
definition of standard "roles" greatly assists system design and
implementation. The technical aspects of implementing security are
considerably easier than they were a few years ago, with mature
toolkits such as the Microsoft .NET framework shielding the developer
from the low-level details of particular encryption or message-digest
algorithms.
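As a hedged illustration of that point (Python here standing in for the
.NET toolkit calls the abstract refers to), a keyed digest gives study
personnel a stable pseudonym for a patient identifier without exposing
the identifier itself:

    import hashlib
    import hmac

    STUDY_KEY = b"per-study secret"  # hypothetical; would live in a key store

    def pseudonymize(medical_record_number: str) -> str:
        # Keyed digest: stable within one study, not invertible without
        # the key, and unlinkable across studies with different keys.
        mac = hmac.new(STUDY_KEY, medical_record_number.encode(), hashlib.sha256)
        return mac.hexdigest()[:16]

    assert pseudonymize("MRN-000123") == pseudonymize("MRN-000123")  # stable
    assert pseudonymize("MRN-000123") != pseudonymize("MRN-000124")  # distinct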
Zachary N. J. Peterson, Randal Burns, and Adam Stubblefield (The Johns Hopkins University)
Limiting Liability in a Federally
Compliant File System
Download PDF here.
Daniel Schutzer (Citigroup)
Financial Services Viewpoint: Towards
the Management and Handling of
Sensitive Data
The needs of society, financial institutions, and the individual
regarding the use, handling, and management of sensitive data
(especially financially related data) are complex and often in conflict
with one another. The ability to audit financial transactions, to
detect anomalous behavior, and to obtain evidence that can hold up in
court is necessary in support of fraud and threat risk management,
dispute and error handling, incident response and forensic analysis,
and regulatory reporting requirements. This need often conflicts with
the individual's concern for privacy and desire for anonymity.
Customized customer service, sales, and product requirements often
introduce further conflict with individual privacy needs. The
introduction of strong authentication (e.g., multi-factor), access
controls, and information protection technology (e.g., cryptography and
trusted agent technology) could serve the dual needs of personal
privacy protection and enhanced security if they could be designed with
all these competing needs in mind. Unfortunately, seven key
design-requirement hurdles need to be overcome in order to
simultaneously achieve these complex interacting needs. They are:
- Be implementable and operable at affordable cost to both customer
and FI.
- Be easy, convenient, and intuitive.
- Be compatible with prevailing accepted customer behavior.
- Be able to support efficient targeted marketing, product and
service customization, and fraud and threat risk management information
needs.
- Be able to preserve a customer's sense of privacy and control
over their personal information.
- Be able to support trusted communications, not only between
financial institution and customer (assured mutual authentication, as
sketched after this list), but also with third parties (e.g., auditors,
regulators, merchants, distributors, IT product providers, and trusted
intermediaries).
- Allow for all of the above to work in the presence of real-world
software that exists with known and unknown bugs and vulnerabilities,
in an environment of continuous patching, and in the presence of
social-engineering fraud schemes, such as phishing.
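As a sketch of the mutual-authentication item above (illustrative only;
an HMAC over a shared key stands in for whatever multi-factor machinery
a real deployment would use):

    import hashlib
    import hmac
    import os

    def prove(key: bytes, challenge: bytes) -> bytes:
        # Response computed by whichever party actually holds the key.
        return hmac.new(key, challenge, hashlib.sha256).digest()

    def mutual_auth(customer_key: bytes, bank_key: bytes) -> bool:
        # Bank challenges customer, then customer challenges bank; a
        # phishing site without the key fails its direction of the exchange.
        c1 = os.urandom(16)
        if not hmac.compare_digest(prove(customer_key, c1), prove(bank_key, c1)):
            return False
        c2 = os.urandom(16)
        return hmac.compare_digest(prove(bank_key, c2), prove(customer_key, c2))

    shared = os.urandom(32)
    assert mutual_auth(shared, shared)              # both hold the key
    assert not mutual_auth(shared, os.urandom(32))  # impostor fails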
Anna Slomovic (Electronic Privacy Information Center)
Health Data Flows: Where PETs Can Help
Patients and physicians have increasing privacy
concerns as health information moves into electronic form. Patients
are concerned because, despite Notice of Privacy Practices, they do
not know who sees their health information in the normal course of
business and have no mechanism for controlling access. Physicians are
concerned because electronic health information systems raise the
possibility that their actions can be tracked and made available for
analysis by third parties, such as accreditation agencies, insurers,
and licensing authorities.
In 1997 the National Research Council report For the Record:
Protecting Electronic Health Information identified two categories of
privacy and security concerns: concerns about inappropriate releases
from individual organizations and concerns about systemic flows of
information throughout the health care and related industries. This
talk examines permissible data flows within and between health care
organizations and recommends areas in which technical solutions may
create a more privacy-friendly environment.
Although health care organizations are creating policies and
procedures to minimize risks of inappropriate disclosures, many
"authorized" users do not need individually identifiable information to
do their jobs. In fact, health care organizations often have special
procedures for VIPs or celebrities to keep them from becoming the
subject of curiosity. There are several areas in which technical
solutions can help limit information flows within health care
organizations without compromising quality of care.
Risks in information flows between organizations arise because the
health care system is complex, fragmented and includes many different
types of organizations. Many data flows fall into the categories
permitted under the HIPAA Privacy Rule, such as treatment, payment,
health care operations, public health, and disclosures required or
permitted by law. There are areas in which technology can play a role
in limiting disclosures of individually identifiable information in
ways that would protect patients and physicians while still permitting
researchers, public health authorities, and others to do their work.