Caste Data, Digitisation and its Problematic Impact on Privacy and Policing

This paper aims to problematise the low threshold of scrutiny applied by the Patna High Court in approving the Bihar Caste Survey without a concrete reason for the conduct of the same, especially in the face of India’s inadequate data protection framework, which fails to protect private data from misuse by the mechanisms of the State itself.

Anagha Damaraju

August 3, 2025 20 min read

I. Introduction

 

Much has been said about the State of Bihar’s 2022 initiative of enumerating every household within its borders and recording information relating to its demographic makeup, particularly that pertaining to caste and sub-caste. Labelled as a caste “survey” to avoid the legislative competency complications arising from the word “census”, the Bihar Caste Survey (‘Survey’) has been at the centre of controversy since it was notified. The distinction matters because only the Union Government has the authority to conduct a “census” under Entry 69, List I of the Seventh Schedule of the Indian Constitution, while state governments are empowered to collect statistics under Entry 45 of List III of the same schedule for any matter contained within List II or List III.

 

One aspect of this controversy is that the exact purpose for which this data has been collated by Bihar remains unclear to a certain degree.[1]  Though a scheme for increased caste-based reservations to the extent of 65% in government jobs and educational institutions was introduced nearly immediately after the release of the Survey data, the declaration by the State Government that the data was to be used for the purpose of affirmative action was not present in the initial notification issued—it was added in a counter-affidavit filed before the Patna High Court (‘Patna HC’) when the survey was challenged.

 

The shaky grounds for the conduct of the Survey, which maps the population of a state based on a criterion that has been the driving force of conflict and violence throughout its history, leave one ill at ease about what the collection and use of this data, and its approval by the State’s judiciary, might mean for the future of Bihar’s, and subsequently India’s, oppressed classes.

 

The pan-India ramifications of the Patna HC judgment, which stated that Bihar had provided a sufficient rationale and objective for the conduct of such a survey, are manifold: a similar survey was conducted in the state of Telangana in November 2024, following approval from the state’s High Court. This, too, has not been without controversy, with respondents reluctant to answer enumerators’ questions, false ‘enumerators’ seeking personal details from respondents, privacy concerns, and allegations of undercounting surfacing during and after the survey. The paper seeks to show in a later section that the survey collected caste data, which is personal in nature as an aspect of the respondents’ identity that affects one’s life in society; thus, it remains the respondents’ prerogative to choose whether or not to reveal their caste. Since this data was collected via digital means, it would come under the ambit of digital personal data, and hence be subject to the tests for violation of the Fundamental Right to Privacy, alongside India’s digital data privacy law as laid down in the Digital Personal Data Protection Act, 2023 (‘DPDP Act’).

 

Centuries of caste marginalisation continue into the modern day in forms which have evolved to keep up with shifting technological dynamics. Digital personal data related to caste has frequently been subject to privacy-violating leaks, as well as misuse by State actors. One aspect of unavoidable individual-State interaction that has historically been, and remains to date, tainted by casteism is policing. From the colonial period till today, the police have been used as a tool to further caste hegemony by focusing resources on communities considered ‘immoral’ or ‘deviant’ in the rigid Indian social order.

 

Technological advancements have enabled a much more far-reaching and invasive method of the very same casteist policing, which is now centred around the use of digital personal data—the collection of personal and biometric details of “repeat offenders” by the police is one steeped in a casteist history, and is now exacerbated by the possibility of the interlinking of the police’s several digital databases, as the paper aims to explain in a following section. This paper argues that the Patna HC’s decision to approve a collection of sensitive data without a clear and defined aim provided by the government for its collection has the dangerous ramification of possibly validating the misuse of caste data of populations across the country, continuing the marginalisation and criminalisation of Dalits in India.

 

For this purpose, the article shall proceed as follows: Part II shall problematise the lack of a defined purpose with which the survey was conducted. Part III shall explore caste as a personal detail protected by the Fundamental Right to Privacy. Part IV urges the Indian judiciary to adopt a more rigid scrutiny of the purported purposes of such data collection exercises. Part V centres on how caste data has been recorded and weaponised against Dalit and Bahujan communities historically and in the present day, and how India’s wholly insufficient data protection framework fails oppressed classes, while Part VI shall conclude.

 

II. The tenuous grounds of Bihar’s caste ‘survey’

 

In Mohinder Singh Gill v. The Chief Election Commissioner (‘Mohinder Singh Gill’), the Supreme Court of India (‘SC’) stated that the validity of the order of a statutory functionary made on certain grounds must be evaluated on the grounds stated in the order itself, and not by supplementary reasons filed in an affidavit or any other form. Additionally, in Commissioner of Police, Bombay v. Gordhandas Bhanji (‘Gordhandas Bhanji’), it was the opinion of Justice Vivian Bose that an order made publicly to exercise some statutory authority could not be justified by explanations offered after the making of the order to clarify the intent of the exercise. As stated by the court in Gordhandas Bhanji, the reason for this rule was that the persons to whom the order may cause some detriment were entitled to know what to do or forbear from doing.

If the respondents of the Survey had not been informed at the stage of data collection that the purported purpose of this exercise was to revamp Bihar’s affirmative action regime, Bihar is not entitled to claim that this was the purpose of the Survey by means of a counter-affidavit filed before the court, once the Survey was challenged before the Patna HC. The fact remains that the Survey was carried out without a specified reason for the collection and use of this sensitive data.

 

The Patna HC, in Youth for Equality v. State of Bihar (‘Youth for Equality’), declared the Survey to be made on valid grounds, stating that the reasoning in Mohinder Singh Gill and Gordhandas Bhanji would not apply to this exercise because it applied to administrative orders, and not to notifications under Article 162, as such notifications were to be tested based on the decision-making process which led to their passing. Since the counter-affidavit filed before the Patna HC, in the court’s opinion, revealed this decision-making process, it was held to be sufficiently declaratory of the Survey’s object. Article 162 demarcates the extent of a State’s power to make executive orders, which may extend to the matters it has the power to legislate upon.

 

In confirming Bihar’s legislative competency to carry out the Survey, the Patna HC refers only to the Executive’s ability to take action under Article 16(1) and (4), which guarantee equality of opportunity to all citizens for public employment, alongside the savings clause allowing the State to reserve posts for those belonging to inadequately represented backward classes; Article 15(1) and (4), which lay down the Fundamental Right Against Discrimination and the power thereunder of the State to make provisions for the advancement of socially and educationally backward classes of citizens; and Articles 38 and 39 of the Indian Constitution, which are Directive Principles of State Policy requiring the State to secure a social order for the welfare of the people, and demarcating certain principles for the State to follow in policy-making respectively. Thus, it states that Bihar need not rely on statutory powers under the Collection of Statistics Act, 2008.

 

Since the action taken is neither statutory nor covered under the Census Act or the Collection of Statistics Act, this implies that Bihar is neither bound by the rules of confidentiality under the former, nor by the requirement to furnish the subject and purpose of collection of statistics under the Collection of Statistics Rules. This has the consequence of essentially enabling the provision of any ground for the collection and use of this data, leaving it vulnerable to use on grounds not communicated to the respondents of the Survey.

While considering the alleged ‘arbitrariness’ of the Survey, under Article 14 of the Constitution, the Patna HC focuses largely on the ‘intent’ of the State in undertaking the measure. Relying on the precedent in Madhya Pradesh Oil Extraction v. State of Madhya Pradesh to say that it was ‘trite’ that policy measures enacted by the Bihar Government were beyond the court’s interference, unless patently arbitrary, the Patna HC was satisfied with the reasons stated by Bihar in its counter-affidavit.

 

However, State action must be informed by reason; an act not so informed would be arbitrary, stated the SC in Kumari Shrilekha Vidyarthi v. State of Uttar Pradesh (‘Shrilekha Vidyarthi’). In Shrilekha Vidyarthi, it was held that even if a discernible principle was prescribed for the undertaking of an act, the conduct of such an act in a manner without a discernible principle would afford it the vice of arbitrariness. Given that the Patna HC sees the execution of the Survey as neither arising from either of the statutes specified earlier nor subject to the protective provisions under them, Bihar is, effectively, not bound to disclose the use of the data collected. This potentially nullifies the reassurances in its counter-affidavit regarding the confidentiality of the data and the purpose of its collection: since there exists no legal obligation on Bihar to honour either the stated purpose of the Survey or the confidentiality of the data, the exercise is possibly rendered arbitrary under this understanding.

 

III. Caste as a personal attribute

 

In the face of state power compelling one to reveal their caste identity, a rather sensitive aspect of one’s personhood that could expose one to undue harm and discrimination, a respondent to such a survey should, ideally, have the option to refuse to reveal such a detail, citing their ‘privacy’ as a reason. In determining whether the matter of one’s caste identity is a “personal” attribute worthy of protection under the Fundamental Right to Privacy, one may look to Justice Sanjay Kishan Kaul’s concurring opinion in Justice K.S. Puttaswamy I v. Union of India (‘Puttaswamy I’). The learned Justice warned against the possibility of state actors violating one’s privacy through means of data collection, surveillance and profiling, the last of which could result in discrimination based on caste.

 

It has been argued that caste is inherently a marker of one’s public identity rather than a private aspect of it, inexpungeable from public life because it casts its pall over social life in India. Though the Patna HC in Youth for Equality did not comment on the nature of caste as a private or public detail, the judges of the SC seem to align with this view: during the proceedings of the challenge to the Patna HC’s decision before the SC, then-Justice Sanjiv Khanna orally remarked that “99 per cent” of individuals may be willing to waive the right to privacy and disclose their caste identity. The Patna HC also dismissed the privacy concerns presented by the petitioner on the grounds that no complaints had yet surfaced from the respondents.

 

This attitude towards the disclosure of sensitive information about oneself, such as one’s caste, ignores certain facets of Article 21 of the Indian Constitution that make such an exercise more complex. Firstly, in Shafin Jahan v. Asokan K.M. & Ors. (‘Shafin Jahan’), the SC has ruled that a part of the “inherent liberty and autonomy” in each individual includes the ability to decide which aspects determine one’s personhood and identity. Secondly, in Puttaswamy I, the ambit of “privacy” has been decided to include informational privacy—that which deals with an individual’s mind, meaning that individuals ought to have control over what information related to their person is disseminated, with protections against the unauthorised use of such information. The right to know the purpose for which the data collected is being used is also guaranteed via Puttaswamy I.  Thirdly, the same judgment protects the privacy of choice, alongside the freedom of self-determination, which harks back to the idea of being able to choose what aspects of one’s identity define oneself as discussed in Shafin Jahan.

 

This freedom, coupled with the protections afforded by the Fundamental Right to Privacy, implies the possibility of individuals possessing the ability to relegate their caste identity to the private sphere of their lives, choosing not to be defined by it—essentially, the right not to disclose such information if they choose to deem it irrelevant, even before the coercive power of the State.

 

IV. Caste surveys and Proportionality

 

The SC’s decision in K.S. Puttaswamy II (Aadhaar) v. Union of India also reiterates the test of proportionality, which a measure abridging the right to privacy must satisfy before it is deemed valid. One aspect of this test is that a designated proper purpose must exist for the measure being undertaken. As mentioned previously, according to the law in Mohinder Singh Gill and Gordhandas Bhanji, no such designated proper ground existed in the notification of the survey. Therefore, the action of the State of Bihar demanding that its residents reveal information that they might consider private cannot stand. The fact that the purpose for which the data is to be used is not revealed to the respondents poses a danger to their privacy and opens the data up to the possibility of misuse.

 

Another aspect of this test that such a measure must fulfil as per the Aadhaar case, is that the measure undertaken must be ‘balanced’ in its infringement of the right and the achievement of its stated objective. If one were to, like the judges of the Patna HC, accept that the law in the aforementioned cases would not apply due to Bihar’s exercise of Article 162 rather than authority under a statute, none of the data protection provisions laid down in said statute would apply. In its judgment, the Patna HC does not engage heavily with the possible privacy concerns arising from this lack of obligation to adhere to these statutes, and takes the claims made in Bihar’s counter-affidavit at face value to declare that the privacy of respondents is being protected by the State.

 

The Patna HC’s deferential attitude towards Bihar in its conduct of the Survey is exemplified by its cursory consideration of the test of proportionality in Youth for Equality—the court examines the Survey based only on its possession of a “legitimate state interest”, and deems it proportionate, without applying the full test to the Survey.

 

As the following section explains, the data collected, in the absence of absolute proof of its use only for a declared welfare objective, can lead to serious ramifications, with a history of caste data being used to profile and police individuals belonging to oppressed communities. The lack of a statutory obligation or an order of the court binding Bihar to keep the data collected confidential leaves the respondents without safeguards or a grievance redressal mechanism in case of misuse. A caste data regime that complied with constitutional requirements, ideally, would be one that firstly, gave a respondent the option not to declare their caste at all, given the right to autonomy and informational privacy discussed earlier; secondly, had its purpose declared at its very outset; and thirdly, afforded to respondents the appropriate statutory data protection provisions required for them to proceed against the State in case of misuse.

 

V. Data, Caste and Policing

 

Though the process of policing in India evolves with technology, it seems unable to shake off its casteist leanings in terms of profiling and criminalisation of Dalit and Bahujan groups. This section seeks to explain the role that data plays in this criminalisation and how India’s inadequate data privacy framework presents an opportunity for the misuse of personal data collected by State actors in policing.

 

A.    The Role of Data in Caste-based Surveillance and Policing 

 

A pervasive aspect of casteist policing in India is one of surveillance—Radha Kumar writes that the colonial government’s chosen method mapped out the country based on population classification and then determined the frequency with which the area would be patrolled by beat policemen. Seemingly ‘unbiased’ police records would note offenders’ names, criminal histories, as well as caste identities, which would then come to be a determining factor in the allocation of police stations. A decline in station construction post-Independence led to the communities most surveilled by the police remaining largely unchanged.

 

The present reveals little change in the desire of the police to surveil populations it finds ‘likely’ to commit offences; only the methods have changed. A predictive policing system reliant on interlinked databases supported by technology companies has emerged. Examples include Honeywell’s partnership with the Bhopal police for their Integrated Video System Management Project, a city-wide CCTV initiative, and the use of Nippon Electric Company’s (NEC) NeoFace facial recognition technology and vehicular number-plate recognition technology to assist the Surat police. As governments digitise their police records, this phenomenon has come to create a big tech-based system designed to continue the criminalisation of the marginalised. The Transnational Institute highlights that multiple applications, software and databases are utilised by the police force at present, collating data of an unknown amount and nature without a known ultimate goal. Though they presently exist independently of each other, they can easily be interlinked to collate personal information on all citizens, allowing for more pervasive institutional profiling, which can then be used to justify differential treatment for those from marginalised communities that the police see as “habitual offenders”. This is the very same phenomenon Justice Kaul feared in his opinion in Puttaswamy I.

 

In Telangana, a state in which a caste ‘survey’ of a similar nature to the Survey has been conducted, this data-based police surveillance is already a reality in its capital city of Hyderabad.  Telangana’s Smart Governance Programme is built to integrate several datasets to create comprehensive profiles of all citizens, not just criminals, in the form of the ‘Integrated People’s Information Hub’. In 2018, the state embarked on an endeavour to enumerate all “professional” and “repeat” offenders, even those acquitted due to a lack of evidence in its Comprehensive Criminal Survey—with sensitive data such as biometrics, geotagged locations of their residences as well as family trees—a detail inextricable from caste—being recorded.

 

In Hyderabad, the sizeable minority community of Muslims is ghettoised, particularly those Muslims belonging to lower castes, in the Old City, which has been subjected to the Hyderabad Police’s Mission Chabutra, which involves stop-and-search operations where late-night “loiterers’” fingerprints are forcibly recorded. The HYDCOP app, which makes available sensitive information such as an offender’s Aadhaar details, has been developed by little-known private companies WinC and Tecdatum, much like Bijaga, the app used by Bihar to enumerate caste data, was developed by Trigyn Technologies, a similar private firm. This data is used to classify areas as “crime hotspots” to aid predictive policing, a classification deeply interlinked with the continued surveillance of “habitual offenders”.

 

B.    Gaps in India’s Personal Data Protection Framework

 

The digital data collected by the State using the technology developed by these private companies is not adequately protected under India’s present data privacy framework. The DPDP Act defines “personal data” as that data of an individual by or in relation to which he is identifiable. This individual is described as the “Data Principal”; those who determine the purpose and means of processing data are labelled “Data Fiduciaries”, and those processing the data on behalf of Data Fiduciaries are termed “Data Processors”.

 

However, the DPDP Act gives the State wide exemptions from compliance with its own rules: §17(1)(a) essentially allows the State as a Data Fiduciary to process data with a near-blanket suspension of its duties to Data Principals if the data is processed “in the interest of prevention … of any offence or contravention of any law for the time being in force in India”. This essentially validates the State’s collection and processing of the personal data of citizens in the exercise of preventative policing, a process marred by caste prejudice, as elaborated in the previous sub-section. The European Union’s General Data Protection Regulation (‘GDPR’), meanwhile, specifies that even if a “legitimate interest” exists for the processing of personal data, the power of the “controller”, who determines the means and purpose of this processing, remains limited by the fundamental rights and freedoms of the data subject.

 

The DPDP Act, unlike the GDPR, makes no special categorisation for data that may be considered more sensitive. Article 9 of the GDPR places both data related to “racial or ethnic origin” and “biometric data for the purpose of uniquely identifying a natural person” in a special category that prohibits their processing. This safeguard is lifted only with the consent of the data subject, for a ‘specified purpose’, the importance of which has been emphasised throughout the article. Even when the data is used for purposes of “substantial public interest”, the processing must be proportionate to its object, and provide specific measures for protecting the fundamental rights of the data subject. Incredibly sensitive data, such as one’s caste identity, takes the form of digital personal data in the context of both the Survey and policing, and is often combined with biometric data for surveillance in the latter. Indian law thus fails to protect those attributes of the country’s most vulnerable groups that can expose them to marginalisation.

Personal data related to caste, when collated digitally without a prescribed purpose, as was done in the case of the Survey, poses immense potential for weaponisation and the continued marginalisation of the very population that it purportedly seeks to protect, with very few recourse mechanisms protecting this data from misuse.

 

VI. Conclusion

 

The Survey, and the surveys that have followed it with a similar methodology of evaluating the percentages of individual castes and sub-castes at a granular household level, was purportedly noble in its belatedly declared object of expanding affirmative action for the oppressed classes in Bihar. Nevertheless, the Patna HC’s approval of Bihar’s undertaking has led to unseen ramifications in the form of privacy violations, which arise once one’s caste is understood as a personal characteristic that one may wish to protect.

 

The Patna HC might have acted with haste in allowing a survey such as this to be conducted without a specified object, essentially allowing the violation of the privacy rights of the most vulnerable section of its population, exposing their data to potential misuse by State mechanisms which have historically marginalised them and rendered spaces inhabited by them carceral. This disproportionate and arbitrary measure, in view of the wholly inadequate data privacy framework afforded to individual Data Principals before the State as a Data Fiduciary, requires courts to exercise greater caution in their approval of actions that do not specify the purpose of data collection at their outset.

 
 

[1] See s. 2, Bihar (In admission in Educational Institutions) Reservation (Amendment) Act, 2023; s. 4, Bihar Reservation Of Vacancies In Posts And Services (For Scheduled Castes, Scheduled Tribes And Other Backward Classes) (Amendment) Act, 2023.

Anagha Damaraju

Anagha Damaraju is a third-year student at the National University of Juridical Sciences, Kolkata. Her interest lies in public law. She can be contacted at llb223074@nujs.edu.
