Not Open or Accountable: The Government Open Data Use License Is Flawed

The consultation process surrounding the new license failed to help protect accountability, privacy and user rights.

The Ministry of Electronics and Information Technology carried out a public consultation on an open license for government data use in August 2016. The draft license contained provisions designed to suppress the right to information and to hold no government official accountable. Every major organisation’s submission to the consultation opposed these provisions. However, the ministry ignored these objections at the end of the consultation process.

There was virtually no major change between the draft and the final version of the license, essentially making the whole public consultation a waste of time.

India’s open data ecosystem has not grown significantly over the last few years — especially when compared to ecosystems seen in the European Union or the US — even though the National Data Sharing and Accessibility Policy (NDSAP) was introduced in 2012.

Its growth was dampened by several factors, among them the fact that government data is under copyright in India. As all the data published through the open government data platform carried a copyright notice, data users faced ambiguity around licensing. Almost four years after the data.gov.in portal went live, an open government license is out, and it limits provisions granted to citizens under the Right to Information Act.

Government Open Data Use License

The draft of the Government Open Data Use License was prepared by a committee led by Suresh Chandra (Law Secretary, Department of Legal Affairs) and including representatives from several government departments, academia and civil society organisations. The license applies to all data published under NDSAP, as well as data published through data.gov.in. It covers issues like attribution, distribution and usage permissions, but it also contains certain provisions that are not normally found in a license. Section (6) of the draft license has the following parts:

“The license does not cover the following kinds of data

(a) Personal information;

(b) Data that the data provider(s) is not authorised to license, that is data that is non-shareable and/or sensitive

(f) Identity documents; and

(g) Any data that should not have been publicly disclosed for the grounds provided under Section 8 of the Right to Information Act, 2005”

The above exemptions sound like guidelines aimed at data providers (to not publish datasets that contain those types of data), but they are clearly directed at data users. The policy’s principles were not followed in defining these exemptions:

  • Personal information can’t be published by a data provider under the policy.
  • How will a data user know if the data provider had authorisation to release a data set?
  • Identity documents such as birth and death certificates are public. How is access to them to be restricted?
  • Section 8 of the RTI Act deals with the state’s exemption from any obligation to release certain information to citizens, and is clearly not applicable to a data user under the license.

There are certain issues which NDSAP did not cover during its formation and doesn’t clarify:

  • It doesn’t clearly specify how data should be classified, instead letting departments take the decision, thus creating ambiguity in the types of data classification used by data providers.
  • It doesn’t specify how the negative list of sensitive datasets should be managed, whether through cyber security practice guidelines or through policies.
  • It doesn’t clarify how an oversight committee should monitor every new dataset being generated by various government departments, nor does it include provisions to safeguard data as an asset.
  • The policy fails to recognise the necessity of data skills in government departments and doesn’t mandate any capacity building mechanisms.
  • The timelines envisioned in the policy were too short for departments to act on it with enough consultations. The policy expected every public dataset to be uploaded within one year of notification, i.e. by March 2013.

Clearly the committee was hoping to rectify some of these issues within NDSAP by including certain clauses in the license. They instead ended up making the license complicated by mixing it with bizarre policy statements.

Warranty of Data

Section 4 of the draft license contains “no warranty” and “no continuity” clauses for the datasets being published. Any data published under this license carries no warranty. That basically implies you can’t make the data provider liable; the data provider here is technically a government department and not an individual. This clearly violates our fundamental right to expression (by limiting access to information) and our right to the disclosure of public records under Article 19 (1) of the Constitution, a right upheld by the Supreme Court in the case of Raj Narain vs State of UP.

Section 4, clause (d) ‘No Warranty’ also states: “The data provider(s) are not liable for any errors or omissions, and will not under any circumstances be liable for any direct, indirect, special, incidental, consequential, or other loss, injury or damage caused by its use or otherwise arising in connection with this license or the data, even if specifically advised of the possibility of such loss, injury or damage. Under any circumstances, the user may not hold the data provider(s) responsible for: i) any error, omission or loss of data, and/or ii) any undesirable consequences due to the use of the data as part of an application/product/service (including violation of any prevalent law).”

Clause (e) Continuity of Provision states: “The data provider(s) will strive for continuously updating the data concerned, as new data regarding the same becomes available. However, the data provider(s) do not guarantee the continued supply of updated or up-to-date versions of the data, and will not be held liable in case the continued supply of updated data is not provided”

These clauses run squarely against the rights guaranteed under the RTI Act: government documents carry some warranty and are definitely admissible in the courts. Every public document out there could be brought under NDSAP and would potentially fall under this license with its no warranty clause. Clearly a dataset that contains the text of various Supreme Court judgments has some warranty. The scope of the license is too vague, and if it is meant to be the default license for all government data, or to replace RTI, it needs a legal basis to do so. The license is overstepping its boundaries and needs to drop the no warranty clause for government data. The Indian Customs and Central Excise Department, for instance, shut down access to a high-value open dataset of every product exported from and imported into the country after demonetisation, moving away from providing a continuous supply of public information and data.

Data mismanagement and security

The National Cyber Security Policy of 2013 rightfully identifies data leakages as a cyber threat and sets safeguarding the privacy of citizen data as an objective. Personal information or any other sensitive data typically falls under the negative list of NDSAP and thus clearly needs to be handled with care, potentially by encrypting it. While India’s draft encryption policy was a disaster, we still have the Information Technology Act, which requires the central government to prescribe minimum guidelines for securing data from theft; there have been none so far. But a license is no place to announce these intentions or to place restrictions on a dataset that may already have been published or leaked.

Data has been mismanaged countless times by government officials, but publishing personal information, knowingly or accidentally, and then trying to regulate it through a license is unheard of and has never been done in practice. During a hackathon in 2015, Bangalore Police released the call data records of people who were potentially under investigation and called it ‘open data’. On the launch day of Sikkim’s open data portal, two datasets revealing names, religion, caste and other personal information of students and teachers in Sikkim were released. During the public consultation on net neutrality in India, the Telecom Regulatory Authority of India (TRAI) published the email addresses of every respondent. All the datasets in question violate the very definition of ‘open data’; they were reported responsibly and taken down. Accidents like these can happen again, and a legal framework will be needed to stop them. Current frameworks do not let data users appropriately report cyber incidents to any authority at all.

Passing the responsibility of not accessing sensitive data on to the data user, and making the distributor not liable, threatens every user and the community that is built around that data. To quote another example, in 2014 the government started publishing details of every RTI request. In doing so it was exposing the personal details of the individuals making those requests. When questioned about the personal information being published as part of the RTI requests, officials were quick to respond that they couldn’t afford to redact the personal details due to a lack of resources. Yet Indian Railways was found digitising the names and addresses of individuals who filed RTIs. An RTI request I made is accessible to the public through the railways’ search portal. The railways should be made liable for publishing personal information on the web without providing enough security. Incidents like these may stop people and activists from filing RTIs; security through obscurity won’t help us.

User and citizen rights

Government departments inevitably hold personal information about India’s citizens. Some of this personal information, like electoral rolls, is public and has been easily accessible to marketing agencies and political parties for their own data needs and analysis.

Initiatives around Aadhaar and Digital India are creating multiple interconnected databases, which are prone to all sorts of data mismanagement issues. For example, most details of a student’s performance are being recorded in certain states. Someone from Hyderabad used education board data to lure young girls under the pretext of counselling and ended up sexually exploiting them. What can you do as a parent to stop your child’s data becoming public without any laws in place?

Digital rights have been at the centre of debates around the Internet with the rise of apps, which more often than not either obfuscate a consumer’s rights or simply take them away. Service providers on the Internet often have complex terms and conditions, which take away your basic rights and give them unlimited leverage over you and your data. Dropbox, Facebook and Google can, at the drop of a hat, suspend access to your own account, making you lose your personal data without giving any reason at all.

It is unfortunate that open data is also being crippled with these strange terms and conditions, designed to make sure that there is no accountability if mistakes are committed by government authorities.

The recently released BHIM app also fails to acknowledge the usage of open source software anywhere in the application.

The disclosure and usage of software licenses in government applications is low. Most officials may not even be aware that such licenses exist, and they seem to always assert copyright over open source work. In a democracy, public data and documents help bring transparency and accountability. But restricting the usage of these documents and data can harm us more than it helps empower citizens. At the same time, government departments and agencies are digitising their records faster than ever before, collecting sensitive personal data at every stage. Initiatives like Digital India and the Smart Cities mission can be both boon and bane, helping digitally empower citizens while also creating new problems of security, privacy or a new digital inequality for them. Open data is a relatively new concept and can harm developing countries if we don’t tread carefully. What works for the West may not necessarily work for us. Open data is “my data, your data and our data”. Let’s be careful with it and make sure we as citizens have our say in keeping officials transparent and accountable.

Disclosure: The author is co-founder of Open Stats, a startup focused on open data.

Srinivas Kodali is an interdisciplinary researcher working on issues of cities, data and the internet. He volunteers with internet movements and communities.