The draft Digital Personal Data Protection (DPDP) rules of 2025 have triggered a debate among the stakeholders involved in developing indigenous Large Language Models (LLMs) and those closely watching the space.
The rules, grounded in seven fundamental principles of data privacy, aim to create a balance between safeguarding user data and encouraging innovation.
The potential impact of the DPDP rules on the LLM landscape in India is multifaceted. While they promise to foster the development of homegrown LLMs by addressing data localisation concerns, the increased compliance costs could become entry barriers for smaller local players. Experts also see a risk of global LLM providers excluding India from new releases, as they have done in Europe.
Jaspreet Bindra, Co-founder of AI&Beyond, felt that the DPDP rules, if implemented as they are, could significantly impact the development, deployment, and operation of LLMs both globally and in India.
“Explicit and revocable consent for data collection is mandatory, with blanket consents no longer valid. Thus, LLM developers must ensure that any training data collected from Indian users meets these stringent consent requirements. Using user-generated data (e.g., social media content) without explicit consent could become legally risky,” said Bindra, who closely watches the Generative AI space in India.
Mayuran Palanisamy, Partner, Deloitte India, said that organisations developing LLMs, whose success depends heavily on underlying data, face numerous challenges. “The challenges include, but are not limited to, a growing user database, data visibility and classification, fragmented and inconsistent data, redundant data, insufficient data management practices, regulatory compliance demands, increased use cases, and a lack of data protection controls,” he said.
“While the DPDPA and its rules provide a baseline for driving compliance, it is critical to implement an effective data strategy. This strategy combines data governance, information lifecycle management, data privacy, and data security requirements to tackle broader challenges and adhere to compliance requirements,” he said.
Rakesh Reddy Dubbudu, founder of fact-checking start-up Factly, who has built a homegrown LLM called Tagore AI, said litigation could be a by-product.
“It is difficult to distinguish between personal and non-personal information in LLM interactions, unlike in traditional technology applications. While explicit information like names and email IDs is clearly identifiable, the content of user prompts and uploaded pictures, which could contain highly personal data, is not easily monitored,” he said.
“A user can upload a picture of other people, or an X-ray, and ask the LLM to provide information. We don’t have control over what people ask, and this could pose a challenge,” he said.
Kashyap Kompella, AI industry analyst, felt that the DPDP Act takes a more pragmatic approach than its European counterpart, striking a balance between the need to safeguard data privacy and technological innovation.
“For instance, the provisions of the Act do not apply to data that is publicly shared by users themselves. Also, data localisation is preferred, but data storage and processing outside India are permitted for legitimate purposes and with user notice and consent,” he said.
“The EU has multiple, overlapping regulations that can impact AI development, such as the EU AI Act, the GDPR, and the Digital Markets Act. It is interesting to note that, due to uncertainty about their applicability, Big Tech companies such as Apple, Meta, Google, and OpenAI have either not released or delayed the launch of their AI models and AI features in Europe,” he pointed out.
The Act also provides exceptions for personal data processing for academic and research purposes, which do not excessively constrain teams doing research and development on, for example, local LLMs.