April 22, 2004

Concurrent 8: Data Retention and Privacy: A Real World Approach to EU and US Regulations

In order to get this session report to you as quickly as possible, I'm going to just present below my lightly-edited notes, which are mostly in the voice of the speakers, and close to a transcript. Things that really stood out to me are in bold.

Cindy Cohn, of the Electronic Frontier Foundation, says one thing that didn't make the cut in the Patriot Act was a data retention law. An ISP doesn't have to save all your email and other data as it is now. But, there are storm clouds on the horizon. Switzerland is looking at this as well as others and it will affect the US. The U.S. is doing law enforcement for international groups already because we do data retention for other purposes.

We should think about data logging and decide what we want to do. At EFF, we've developed tools to make that easier: Jeff will talk about it.

Data retention isn't an affirmative requirement for ISPs right now. There is a data preservation law now, part of ECPA (the Electronic Communications Privacy Act), such that ISPs must preserve data upon an affirmative notification from law enforcement. The standard that law enforcement must meet to be allowed to make that request needs to be high enough, but this sort of thing is here to stay. Right now, once the request is granted, data is collected for a 90-day period and then the warrant must be renewed. EFF only has problems with the specifics of this law, but thinks that when law enforcement has met its burden, they should be allowed to collect such data. Civil discovery disputes impose similar requirements: once you are on reasonable notice of litigation, you have to retain certain data.

Your ISP probably collects massive amounts of data about your Internet usage. Some of this is congenital to hackers. Hackers are pack rats and they build systems that do the same thing. Apache defaults to save everything. So do other webservers. But, we're not all pack rats nor do we all want to be. These logs are useful for data networking problems, spam problems, hacking, etc. But, we have gone to ISPs and said: look at 1) what do you gather and 2) what do you use, and they save significantly more than they ever need/use. Mostly they could turn that logging on when they need it and accomplish the same goal. EFF has an on/off logging switch to use when needed. ISPs should try to log wisely.

ISPs should be thinking about the fact that the consequences of you having data are severe. If you have it and get asked for it, you have to turn it over. But, the law does not require you to create data you do not have. This is a big distinction. Indymedia taught me this lesson.

Indymedia does media in a non-hierarchical way. They run an open wire service. It was run off a server in Seattle, at least in 2000. This happened pre-9/11. They received a court order from a federal court demanding all info, including IP addresses, off their servers for the last few weeks. Apparently, somebody had found Quebec's security plans (in a parked car!) for an upcoming anti-globalization rally, and posted them on the wire. Indymedia got asked for EVERYthing, not just specific data. They realized they had this stuff going back several years. Given the kind of organization Indymedia was, law enforcement had found a pretextual reason for getting lots of juicy data on Indymedia users.

"On the Internet, no one knows you're a dog." Well, your ISP always knows you're a dog. Your IP address leads to you and your address and probably your credit card info. In the end, Indymedia didn't have to turn it over due to EFF's work. Indymedia's ISP had a lot of the data too. Speakeasy is a good ISP, but they didn't share Indymedia's goals.

EFF has considered doing a survey of ISPs regarding their data collection practices, but the ISPs seem extremely resistant to giving out that data. So, talk to your ISP. See what you can find out. ISPs should be asking themselves, "What are our actual needs, not our theoretical ones?" System administrators seem to have this fantasy: The CEO is going to come in and say, "I need this e-mail from four years ago. It's crucial to save the company!" Then the admin will grep the log for the e-mail, find it and be the hero. We need to let go of the fantasy. The more realistic threat is the Feds busting down the doors, and asking for all your logs. The new hero scenario is when the sys-admin says, "I can give you the last seven days, but after that it's gone."
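To make the "last seven days" scenario concrete, here is a minimal sketch of a retention job that deletes log files older than a cutoff. It is only an illustration, not a tool anyone on the panel described: the function name `purge_old_logs`, the seven-day default, and the idea of judging age by file modification time are all assumptions.

```python
import os
import time
from typing import List, Optional

SECONDS_PER_DAY = 86400

def purge_old_logs(log_dir: str, max_age_days: int = 7,
                   now: Optional[float] = None) -> List[str]:
    """Delete regular files in log_dir last modified more than max_age_days ago.

    Hypothetical helper: age is judged by modification time, which assumes
    rotated log files are not touched again after rotation.
    Returns the names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        # Skip subdirectories; only plain files are candidates for deletion.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```

Run from a daily scheduled job against the directory holding rotated logs, something like this would leave the sys-admin with roughly a week of data and nothing older.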

An ISP can cost its customers thousands of dollars if a customer ends up in civil litigation and has to pay a lawyer to sift through all those logs. Get out your checkbook, and write a check that has at least five zeros behind it.

ISPs can create a reasonable expectation of privacy in our emails. If ISPs always keep data, then it isn't reasonable for people to believe that ISPs won't, and so we have no privacy under the law. Also, we just want a world where law enforcement doesn't search with a driftnet, but with a line and a pole.

Jeff Ubois, a researcher at SIMS at UC Berkeley, looked at the usage log issue last year. Check the SIMS server for more info. He produced a tool for scrubbing web logs. Technically it looks easy, but there are lots of constituencies that make this hard, and the legal issues are unclear. Is an IP address personally identifiable information? If so, a whole set of laws kicks in, but if it's not, there is a different legal result. Customization of sites: defaults matter a lot. Someone is working on that. What happens is that optimistic marketing people think they can derive valuable information from logs sometime in the future, such as what people in different geographical areas want, where customers are coming from, etc. The logs are also useful to verify transactions, track security problems, improve the user experience and user interface design, see how people move on a site from page to page, etc. There is a reluctance to throw data away, but there is liability in keeping it and people are learning this.

Some of the logging laws: FERPA, the Family Educational Rights and Privacy Act, is relevant to when an IP address becomes personally identifiable information. Is logging at university campuses subject to FERPA? Yes. Another law enforcement use: The Danny Pearl kidnappers were using the e-mail address kidnapperguy@hotmail from a proxy server in Pakistan, and were convicted through tracing. So, some good comes from enabling logging. We made a simple log scrubbing tool that at a minimum removes everything after the question mark in a CGI script request and the last few digits of the IP address. An EFF intern from last summer has some code and needs help, if you're interested. [This is a cool idea! Where's the code?]
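For readers wondering what such a scrubber might look like, here is a rough sketch of the two minimum steps Jeff described: dropping everything after the question mark in the request and blanking the tail of the IP address. This is not the SIMS or EFF code, which I haven't seen. It assumes Apache-style access log lines that start with an IPv4 address, it interprets "the last few digits" as the final octet, and the names `scrub_line`, `IP_RE`, and `QUERY_RE` are made up for illustration.

```python
import re

# Assumption: each line begins with the client's IPv4 address, as in
# Apache's common/combined log formats. Keep the first three octets only.
IP_RE = re.compile(r'^(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}(?=\s)')

# Assumption: the request is quoted, e.g. "GET /path?query HTTP/1.0".
# Capture up to the question mark and discard the query string itself.
QUERY_RE = re.compile(r'("[A-Z]+ [^ ?"]*)\?[^ "]*')

def scrub_line(line: str) -> str:
    """Return a log line with the last IP octet zeroed and the query string removed."""
    line = IP_RE.sub(r'\1.0', line, count=1)
    line = QUERY_RE.sub(r'\1', line, count=1)
    return line

if __name__ == "__main__":
    sample = ('192.168.1.57 - - [22/Apr/2004:18:01:00 -0700] '
              '"GET /cgi-bin/search.cgi?q=secret&user=bob HTTP/1.0" 200 512')
    print(scrub_line(sample))
```

Note that this leaves the referrer field alone, and referrers can carry query strings too, so a real tool would probably need to do more than this minimum.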

Andrea Monti, of Electronic Frontiers Italy, spoke next. His group is not related to EFF, but has been in touch with them since 1994, as they share similar concerns.

Three Topics
1. Data retention approach not focused on privacy protection.
2. What happened in the last few months while passing a law enforcing data retention.
3. How data retention affects rights of criminal defense, child porn, defamation, etc.

It's not just privacy. Every kind of cybercrime is based on digital evidence, and every computer crime is based on IP addresses as a starting point for tracing that data. Collecting is important because your ISP creates the evidence and they are not bound to collect it in any form that guarantees any sort of integrity. Recently, law enforcement in Italy failed to match times and dates in servers in an investigation of an individual suspected of possession of child porn. Police went to his house, seized everything, but didn't find anything. There is still no evidence, but he has been under judgment for four years. They still need to start this trial!

You lose your right to a fair trial if evidence is not properly collected, handled and analyzed. You will be indicted without any concern for privacy. Last December we, in Italy, had a duty of data preservation law passed. Traffic data was to be collected. It was passed as a "decree law" which is an urgent law issued by the government and not by parliament. This is non-standard. The early draft took the path that ISPs must be bound to retain every kind of Internet and telephone traffic. Email retention especially. Several modifications of this text were classified. We later learned that this email data retention has been erased from the final text. The result is a messy law that talks about retention of telephone data handled for billing purposes which is a meaningless definition.

Two hidden dangers: 1) The principle that has been stated. Data retention is useful for criminal investigations. You can retain data not just for serious crimes but even for every kind of crime involving a computer thus creating a broader exception and dangers for civil rights, even for petty crimes. 2) Another weakness of this law is that it just talks about the obligation to preserve without enforcing any order about integrity of data. If the info is wrong, "we don't care", so what's the point in saving this data? What good is a rotten apple that you can't eat? We are told that this data retention is necessary for terrorism and serious crimes, but after just one month it was transformed into a much broader law. The ministry of cultural assets enforced the law for fighting "the plague of peer to peer". They also pursued a law to make it a criminal offense to use encryption to hide the transmission of any communication. (WOW!) To fight peer to peer it was important to retain all kinds of Internet data. So, what they can't get with a terrorism excuse they try with the copyright excuse.

In one case, all the police did was wander around the net, check ID assignments and make a phone call or ask by fax to get logs. They performed the whole investigation on paper. A court expert reviewed the whole investigation process and the way the evidence was collected. He said the techniques are not reliable and do not constitute evidence in court. Forensic software is not good enough for trial purposes. Are the mistakes operator mistakes, user mistakes, or both? Data retention is an evil we cannot avoid. The government wants to enforce this. However, we can push for a provision that creates accountability on law enforcement. You must grant a legal value to this data. That way I can preserve my right to a fair trial.

Susan Brenner, Professor at the University of Dayton School of Law, did not speak on data retention directly. Instead she explained a European Commission project called CTOS, for Cyber Tools Online Search for Evidence. It picks up on Andrea's premises, i.e., the notion that there is a problem with standard criminal and civil procedures for collection of data. Lack of expertise opens this collection up to defense attorney challenges, and there are problems with varying international legal standards regarding rules of evidence. Primarily CTOS focuses on cross-border cyber crimes. It tries to deal with these different legal standards across countries and thinks about how to do this collection without being subject to attacks on its reliability or validity.

The notion is to create an architecture that standardizes the processes for collecting digital evidence. Such collected data would be legally admissible, authentic, convincing and reliable. The project closed a year ago last May, when we saw "the demonstrator," which is a functioning version of CTOS.

CTOS differs from the Council of Europe Convention on Cybercrime's solution. Hopping borders makes it hard to collect evidence, and if it is collected incorrectly, there are more problems. There are also the varying legal standards. The Council of Europe Convention on Cybercrime suggests a convention of standard laws of evidence for data collection and retention across countries. CTOS instead says we don't have standardized evidence collection standards in lots of places, despite best practices that often aren't known about. So, CTOS is a software architecture that runs in three phases. The standard phase is where everything is working. Then comes the suspicion phase, where we ask "did something happen?" CTOS prompts the system administrator or investigator through that process, but really helps with the third phase, the investigation phase. CTOS prompts you through the investigation. It gives advice, like check the timestamp. One prompt leads you to incident response policies and legal standards. It says things like: you should try this, you should be concerned about these things. It's very complex and hard to explain in 10 minutes.


I asked Cindy Cohn if consumers had any hope of getting a right to know what data our ISPs collect on us and how long they keep it, much like one is allowed to check one's credit report. And how about a right to ask for regular data flushing by your ISP, similar to opting out of telemarketing? Cindy says: No chance. Legislators are going to say the market can solve this problem. If consumers want ISPs to do these things, they will ask for them and the ISP that distinguishes itself by offering them will succeed. She encourages people to call their ISPs and see what they can learn.

Posted by brianwc at April 22, 2004 06:01 PM