Debdutta Guha, Associate Principal, Security and Privacy, Google LLC.
AI, particularly GenAI, is revolutionizing how we interact with technology. From the conversational finesse of chatbots and virtual assistants to the creativity unleashed in personalized content and the power of advanced data analysis, AI is becoming deeply embedded in our lives.
With Big Tech racing to position itself at the forefront of this transformation, we stand on the brink of a profound evolution in human-technology interaction, led by GenAI.
But as someone entrusted with protecting user data at scale, I approach this AI revolution with both curiosity and caution. After 14 years of safeguarding user data across products and the internet, one truth stands out: The security and privacy of our data ultimately depend on the choices we make—both as developers and as users.
It’s time to dive deeper into those choices and the responsibilities they carry, especially as these choices relate to GenAI.
Interaction To Insights, Responsibly
Data powers large language models (LLMs) just as electricity powers computers—without it, nothing works. Over the past two decades, data has been both easy to collect and incredibly valuable. But I often ask myself: Where do we draw the line?
Consider Amazon employees and contractors listening to Alexa recordings, the Cambridge Analytica scandal or Clearview AI scraping social media for facial recognition databases. These practices raise undeniable ethical, security and privacy concerns.
As open-source AI platforms rise in popularity, data security and privacy must be non-negotiable priorities. Open source thrives on transparency, but that also means the responsibility rests on us, the developers, to act ethically and protect user data.
Let’s dive into the risks of open-source AI and the actionable strategies to address them.
The Risks
Most of the risks AI systems pose, and face, are the same as with any largely unrestricted data collection system. What sets AI apart is scale. Here are a few of the main concerns:
• Unclear Data Harvesting/Retention Policies: AI’s hunger for training data has sparked an aggressive race for data collection, with developers constantly testing the limits of how much and what they can gather. The bigger issue? Users often have little control over what’s collected, how it’s stored or why it’s used—opening the door to misuse and unauthorized access.
• Misinformation, Bias And Discrimination: AI-driven deepfakes and fabricated narratives are fueling misinformation and smear campaigns, with the World Economic Forum’s 2024 Global Risks Report naming disinformation as the top global threat in the next two years. On top of that, biases in AI training datasets lead to troubling outcomes—like Amazon’s hiring tool penalizing women, biased credit scoring or facial recognition systems that amplify discrimination.
• Data Leaks And Security: An IBM survey revealed that only 24% of GenAI projects prioritize security—a worrying trend as speed trumps safety in AI development. With AI handling sensitive data, it’s a prime target for bad actors. In 2023, OpenAI faced a data breach, and Samsung banned ChatGPT after unintentional data leaks. Alarmingly, researchers show that poisoning just 3% of training data can cause up to 23% errors in model output.
Mitigations
The traditional triad of confidentiality, integrity and availability still applies to AI systems, but we must also focus on securing the entire AI pipeline to ensure models are reliable and trustworthy. While NIST’s report on Adversarial Machine Learning offers valuable insights on specific attacks and mitigations, it’s crucial to take a broader, bird’s-eye view of AI security.
Here’s how:
Governance, Risk And Compliance (GRC)
I’ve spent the last four years immersed in the world of GRC, and I’m convinced it’s the foundation for secure AI. Instead of being an afterthought, GRC needs to be built into AI systems from the ground up.
With new AI regulations like the EU AI Act on the horizon, GRC-focused AI applications have a golden opportunity to stand out.
• Respecting Data Privacy: Only collect essential data, give users fine-grained control and always get their explicit consent. Ditch "opt-out" for "opt-in" data collection (a minimal sketch of an opt-in gate follows this list).
• Being Transparent: Maintain clear, concise and updated privacy policies that detail exactly how data is collected, used, shared and stored. (Shockingly, most third-party apps I’ve seen get this wrong.)
• Staying Accountable: Regularly audit systems to ensure compliance and security controls are truly effective.
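To make "opt-in" concrete, here’s a minimal Python sketch of a consent gate. The names (ConsentRegistry, collect_telemetry) are hypothetical, chosen purely for illustration; the point is the default behavior: no consent record means no collection.

```python
# Hypothetical sketch of an explicit opt-in gate. ConsentRegistry and
# collect_telemetry are illustrative names, not a real API.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ConsentRecord:
    user_id: str
    purpose: str       # e.g., "model_training", "analytics"
    granted: bool
    timestamp: datetime

class ConsentRegistry:
    def __init__(self) -> None:
        self._records: dict[tuple[str, str], ConsentRecord] = {}

    def grant(self, user_id: str, purpose: str) -> None:
        self._records[(user_id, purpose)] = ConsentRecord(
            user_id, purpose, True, datetime.now(timezone.utc)
        )

    def has_consent(self, user_id: str, purpose: str) -> bool:
        record = self._records.get((user_id, purpose))
        # No record means no consent: the default is refusal, not collection.
        return record is not None and record.granted

collected: list[dict] = []  # stand-in for a real storage pipeline

def collect_telemetry(registry: ConsentRegistry, user_id: str, payload: dict) -> None:
    if not registry.has_consent(user_id, "model_training"):
        return  # fail closed: drop the data unless the user explicitly opted in
    collected.append(payload)

registry = ConsentRegistry()
collect_telemetry(registry, "alice", {"event": "click"})   # dropped: no opt-in yet
registry.grant("alice", "model_training")
collect_telemetry(registry, "alice", {"event": "click"})   # stored
print(len(collected))  # 1
```

The key design choice is that a missing record is treated as "no," so a bug that skips consent registration fails closed rather than open.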
Securing The Data Supply Chain
The focus should be on protecting every stage: data collection and processing, model development and training, and live use in production.
• Shift Left: Shift vulnerability assessments “left” in the SDLC to ensure apps are secure from the start—something that’s still too rare even in 2024.
• Protecting The Data: Use strong data security controls like encryption, access control and compliance monitoring to safeguard training datasets.
• Maintaining Data Integrity: Perform regular audits of datasets to ensure they’re representative, diverse and free from bias.
• Detecting Anomalies: Set up automated, robust outlier/anomaly detection strategies (see the first sketch after this list).
• Tracking Everything: Keep detailed logs of data origins, transformations and handling processes for full transparency and traceability.
• Embracing New Techniques: Leverage methods like differential privacy and federated learning to enhance security and privacy (see the second sketch after this list).
• Developing Incident Response Plans: Establish clear incident management and emergency response protocols for breaches.
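For the anomaly detection bullet above, here’s a minimal sketch using a median-based modified z-score, which resists being skewed by the very outliers it’s trying to catch. The 3.5 threshold is a common rule of thumb, not a standard; real pipelines would layer more sophisticated methods (e.g., isolation forests) on top and run checks on every ingestion batch.

```python
# Minimal sketch: flag suspicious training data points with a modified z-score
# built on the median and median absolute deviation (MAD), which are robust
# statistics that outliers cannot easily inflate.
import numpy as np

def flag_outliers(values: np.ndarray, threshold: float = 3.5) -> np.ndarray:
    """Return a boolean mask of points whose modified z-score exceeds `threshold`."""
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        return np.zeros(values.shape, dtype=bool)  # constant column: nothing to flag
    modified_z = 0.6745 * np.abs(values - median) / mad
    return modified_z > threshold

# Example: a batch with two poisoned values slipped in among normal readings.
batch = np.array([10.2, 9.8, 10.5, 10.1, 98.0, 9.9, 10.3, -75.0])
print(flag_outliers(batch))
# -> [False False False False  True False False  True]: 98.0 and -75.0 are held for review
```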
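And for the privacy-enhancing techniques bullet, here’s a toy illustration of differential privacy’s Laplace mechanism: releasing a noisy count so that any one user’s presence or absence has only a bounded effect on the output. The epsilon value and query are illustrative; in production, reach for a vetted library (such as OpenDP or Google’s differential-privacy library) rather than rolling your own.

```python
# Toy sketch of the Laplace mechanism from differential privacy. A counting
# query has sensitivity 1 (one user changes the count by at most 1), so noise
# is drawn from Laplace(0, sensitivity / epsilon).
import numpy as np

def dp_count(records: list, epsilon: float = 1.0) -> float:
    """Release a differentially private count of `records`."""
    sensitivity = 1.0
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return len(records) + noise

users_who_clicked = ["u1", "u2", "u3", "u4", "u5"]
# Smaller epsilon = stronger privacy = more noise (scale 2.0 here).
print(dp_count(users_who_clicked, epsilon=0.5))
```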
The Future
The AI revolution, especially with the rise of generative AI, brings incredible possibilities—but also unprecedented challenges for data security and privacy.
We’re at a crossroads. To navigate it successfully, developers, policymakers and users must join forces to build a future where AI benefits everyone. This means:
• Establishing strong data governance frameworks that ensure responsible data collection, storage and use in generative AI applications.
• Creating adaptive regulatory frameworks that address the unique challenges posed by this rapidly evolving technology.
• Fostering collaboration and education across all stakeholders to promote responsible AI innovation and address societal concerns.
The future of AI is bright, but only if we prioritize data security and privacy. By proactively tackling these challenges head-on, we can unlock AI’s true potential and ensure these powerful technologies are developed and deployed ethically and securely for the benefit of all.