Partially Redacted: Data, AI, Security, and Privacy
Partially Redacted brings together leaders in engineering, data, AI, security, and privacy to share knowledge, best practices, and real world experiences. Each episode provides an in-depth conversation with an industry expert who dives into their background and experience. They’ll share practical advice and insights about the techniques, tools, and technologies that every company – and every technology professional – should know about. Learn from an amazing array of founders, engineers, architects, and leaders in the data and AI space. Subscribe to the podcast and join the community at https://skyflow.com/community to stay up to date on the latest trends in data and AI, and to learn what lies ahead.
Episodes
Wednesday Nov 30, 2022
Companies use bug bounties and penetration testing to proactively look for vulnerabilities in their systems. These programs should be part of any security-conscious organization.
However, even with these programs in place, it can be difficult to stay ahead of hackers and potential attacks. Additionally, penetration testing tools can be complex to run, and a single test often requires combining several of them.
Former pentester and bug bounty hunter Nenad Zaric joins the show to talk about the types of vulnerabilities that companies should be looking for and about how to automate security workflows through the Trickest platform, a company he founded. Nenad's advice from years of cybersecurity work is to be proactive and always attack yourself so that you can find the problems before the attacker does.
Topics:
Tell me about your history as a former penetration tester and bug bounty hacker. First, what is pentesting and bug bounty hacking?
How did you get into this field and learn the skills necessary to hack into systems?
What are some of the common mistakes companies make when it comes to security that allows a hacker to penetrate their security?
How should companies think about protecting themselves and their customer data to prevent such attacks?
What is Trickest?
Is the idea of automating security workflows something new? How is this traditionally done and how does Trickest improve on the traditional model?
What are the common use cases that people are using Trickest for?
What sort of common attacks does Trickest help prevent?
Who’s your typical user? Is it a white hat hacker, like a pentester, or is it security professionals within an organization wanting to have an automated system for testing their security?
Walk me through how to set up an automated security workflow. What’s the output of a workflow?
How do the ready-made workflows work? What are some examples?
Where do these security workflows fit with the overall development process for a company? Is this an on-going thing that should be continually run and tested for weaknesses?
A large number of attacks have a human element, such as social engineering. How does Trickest help prevent such attacks?
What are your thoughts on the future of security? Are we getting better at protecting and locking down systems?
What’s next for Trickest? Anything in the future roadmap that you can share?
Resources:
Trickest
Wednesday Nov 23, 2022
Edge devices are hardware devices that sit at the edge of a network. They could be routers, switches, your phone, voice assistant, or even a sensor in a factory that monitors factory conditions.
Machine learning on the edge combines ideas from machine learning with embedded engineering. With machine learning models running on edge devices, amazing new types of applications can be built, such as using image recognition to only take pictures of the objects you care about, developing self-driving cars, or automatically detecting potential equipment failure.
However, with more and more edge devices being used all the time that might be collecting sensitive information via sensors, there are a number of potential privacy and security concerns.
Dan Situnayake, Head of Machine Learning at Edge Impulse, joins the show to share his knowledge about the practical privacy and security concerns when working with edge IoT devices and how to still leverage this incredible technology but do so in an ethical and privacy-preserving way.
Topics:
What’s your background and how did you end up as the head of machine learning at Edge Impulse?
What is an edge device?
What is Edge Impulse and what are the types of use cases people are solving with AI on edge devices through the Edge Impulse platform?
What are the unique security challenges with edge devices?
Since these devices are potentially observing people, collecting information about someone’s movements, what kind of privacy concerns does someone building for these devices need to think about?
Are there industry best practices for protecting potentially sensitive information gathered from such devices?
Is there research into how to collect data but protect someone's privacy when it comes to building training sets in machine learning?
What happens if someone steals one of these devices? Are there safeguards in place to protect the data collected on the device?
Where do you see this industry going in the next 5-10 years?
Do you foresee security and privacy getting easier or harder as these devices become more and more common?
Resources:
Edge Impulse
AI at the Edge
Wednesday Nov 16, 2022
The Payment Card Industry Data Security Standard (PCI DSS) is an information security standard for organizations that handle branded credit cards from the major card schemes. It was introduced to create a level of protection for card issuers by ensuring that merchants meet minimum levels of security when they store, process, and transmit cardholder data and ultimately reduce fraud.
Merchants that wish to accept payments need to be PCI compliant. Without PCI compliance, the merchant not only risks destroying customer trust in the case of a data breach, but they risk fines and potentially being stopped from being able to accept payments.
Payment processors like Stripe, Adyen, Braintree, and so on, help offload PCI compliance by providing PCI compliant infrastructure available through simple APIs.
Bjorn Ovick, Head of Fintech at Skyflow, formerly of Wells Fargo, Visa, Samsung, and American Express, holds over 20 patents related to payment applications. He joins the show to share his background, thoughts on the evolution of technology in this space, break down PCI DSS, payment processors, and how Skyflow helps not only offload PCI compliance but gives businesses flexibility to work with multiple payment processors.
Topics:
Can you share a bit about your background and how did you end up working in the financial industry?
You also have over 20 patents for payment applications, what are some of those patents?
So you are Head of Fintech Business and Growth at Skyflow, what does that consist of and how did you come to work at Skyflow?
Can you talk a bit about the evolution and change of the fintech market from when you started your career to today?
What is PCI DSS and where did it come from?
How does a company achieve PCI DSS compliance?
What’s a company’s responsibilities with respect to PCI compliance?
What's it take to build out PCI compliant infrastructure?
What happens if you violate PCI compliance?
How do you offload PCI compliance and still accept payments?
What is PCI tokenization?
What patterns do you see in payments and what should someone consider as they build their payment stack?
Why would a merchant use multiple payment processors?
How does a company use multiple payment processors?
What is network tokenization and how does that improve privacy and security?
What is 3D secure?
What are the big gaps in terms of payment processing today? What problems still need to be solved?
Where do you see the payment technology industry going in the next 5-10 years?
Where should someone looking to learn more about the payments space go?
Resources:
Network Tokenization: Everything You Need to Know
Multiple Payment Gateways: The Why and How
Wednesday Nov 09, 2022
DevOps is a concept that has exploded in the past few years, allowing software development teams to release software and automate the process. This decreases time to market and speeds up learning cycles. Continuous Integration and Continuous Delivery (CI/CD) automates the software delivery pipeline, continuously deploying new software releases.
But when we deploy code quickly, it's imperative that we don't ignore the security aspect from the beginning. Ideally, we shift security left and incorporate it into the pipeline right from the start. This reduces software vulnerabilities and makes sure our cloud resources are configured following the best practices in terms of security.
Google Cloud Principal Architect Anjali Khatri and Google Cloud Solutions Engineer Nitin Vashishtha join the show to discuss DevOps, DevSecOps, the shift left movement, and how to use Google Cloud to create a secure CI/CD pipeline.
Topics:
How has the cloud changed the way people need to think about architecting secure systems?
How does the scale of cloud potentially impact the scale of a security or privacy issue?
What is DevOps?
Why is this area so hot right now?
What problems has the DevOps movement helped solve that were traditionally difficult or impossible to address?
How does the Shift Left movement for security relate to what’s happening in DevOps?
What is DevSecOps?
How does DevSecOps fit into a company’s overall security and privacy program and strategy?
When it comes to things like CI/CD, what are the common mistakes people can make when it comes to security, privacy, or compliance?
Cloud Build is a serverless CI/CD platform, why do I need something beyond this to automate my pipeline?
What other Cloud tools and components should I be using to make sure my CI/CD system is not only able to support my team’s day to day development but is actually secure?
Can you talk about Artifact Registry and what that product means in terms of security?
How does Cloud’s Binary Authorization system work? Why would I use it and how does that improve my security posture?
Does the addition of security as part of say my CI/CD pipeline impact performance in a meaningful way?
Can you walk me through what the CI/CD process looks like using the combination of Cloud tools and resources?
How much knowledge and experience do I need to set this up?
How does a combination of tools like this play with configuring Cloud resources directly within the Google Cloud Console?
Are there Cloud products that help me lock down my source code?
Are there Cloud products that automatically scan my code for security or privacy vulnerabilities?
What are your thoughts on the future of cloud security?
Are there technologies in this space that you are particularly excited about?
Where should someone looking to learn more about DevSecOps and cloud security go?
Resources:
Building a secure CI/CD pipeline using Google Cloud built-in services
Introducing Google Cloud's new Assured Open Source Software Service
Software Delivery Shield overview
Cloud Workstations
Identity & Security
Google Cloud Security Best Practices
Wednesday Nov 02, 2022
Over the past 20 years, there's been tremendous growth in technology for digital health. From healthcare management software, medical devices, to fitness trackers, there's more health data available about an individual than at any other time.
However, with an increase in data, there's also been an increase in considerations for the secure management of this data. Privacy regulations haven't been able to keep up with the explosion of technological growth.
Jordan Wrigley, Researcher for Health and Wellness at the Future of Privacy Forum, joins the show to share her expertise about digital health data privacy. Sean and Jordan discuss the goals and activities of the Future of Privacy Forum, how culture impacts how an individual thinks about health-related privacy, the shift in concern over health data privacy, and what a company needs to be thinking about when building products that collect or process digital health data.
Topics:
Who are you? What’s your educational background, work history, and how you ended up where you are today?
What is the Future of Privacy Forum?
What are the goals, activities, and focus areas of the organization?
How do people and companies typically engage the FPF?
What is your role and area of expertise at the FPF?
Do you think there’s been a shift in privacy sensitivity with regards to medical and health data in the past few years and if so, what has led to the growing concern and focus?
What’s considered health-related data when it comes to privacy regulations?
What types of tools/techniques should a company be considering to improve their privacy and security posture when dealing with health data?
Let's say I'm a gym owner. What do I need to know about my responsibilities in terms of privacy when it comes to the collection and management of health-related data?
Where is the line between a fitness tracker and an actual medical device? And should these trackers have more regulatory demands placed on them?
What are the regulatory requirements for an actual medical device?
If I’m processing clinical trial data and I want to be able to perform analytics on the data and produce sharable reports, what do I need to know about maintaining privacy in this scenario?
How does developing for children impact the types of privacy and security considerations that a company needs to be thinking about?
What are your thoughts on the future of privacy? Are there tools, technologies, or trends that you’re excited about?
What are some of the big challenges in privacy that we need to solve?
Resources:
Future of Privacy Forum
Wednesday Oct 26, 2022
Passwords have been around since the 1960s and as a means to keep someone out of a non-connected terminal, they were relatively secure. The scale of a compromised system was relatively low. But the world has changed drastically in that time. Every computer is connected to a massive network of other computers. The impact scale of a compromised password is multiple times more problematic than it was even 30 years ago, yet we continue to rely on passwords as a security means to protect account information.
Security measures like longer passwords, more complicated schemes, no dictionary words, and even two-factor authentication have had limited success with stopping hacks. Additionally, each of these requirements adds friction to a user accomplishing their task, whether that's to buy a product, communicate with friends, or log in to critical systems.
WebAuthn is a standard protocol for supporting passwordless authentication based on a combination of a user identifier and biometrics. Consumers can simply log in with their email and a thumbprint on their phone, or rely on facial recognition on their device. Passwordless authentication not only reduces friction for users, it also removes a massive security vulnerability: the password.
Nick Hodges, Developer Advocate at Passage, joins the show to share his knowledge and expertise about the security issues with traditional passwords, how passwordless works and addresses historical security issues, and how Passage.id can be used to quickly create a passwordless authentication system for your product.
Topics:
What’s the problem with passwords?
Why have passwords stuck around so long?
What’s it mean to go passwordless?
What is a passkey and how do they work?
How does the privacy and security of a passkey compare to a standard password?
A Passkey is stored within the Trusted Platform Module of a phone. What happens if someone steals my phone?
What happens if I upgrade my device? Do my passkeys come with me?
What are the potential security risks or limitations of passkey based login?
What if I don’t have my phone? Can I still log in?
Can you share an account with someone else? How does that work?
When a business switches over to using a passkey approach, what’s the reaction from their customers?
Is there a big educational challenge to convince companies to ditch passwords?
Why is a passkey approach to login not more widely adopted? What’s stopping mainstream use?
What is Passage and how is it helping businesses go passwordless?
Who’s your typical customer? Startups just building their auth system or are people replacing existing systems for this approach?
What’s it take to get started? How hard would it be for me to rip out my existing authentication and adopt Passage?
What are your thoughts on the future of passwords and password security? How far away are we from completely getting rid of passwords?
What’s next for Passage? Anything on the future roadmap that you can share?
Resources:
Passage
Passage Demo
Connect with Nick
Wednesday Oct 19, 2022
Differential privacy provides a mathematical definition of what privacy is in the context of user data. In lay terms, a data set is said to be differentially private if the existence or lack of existence of a particular piece of data doesn't impact the end result. Differential privacy protects an individual's information essentially as if her information were not used in the analysis at all.
This is a promising area of research and one of the future privacy-enhancing technologies that many people in the privacy community are excited about. However, it's not just theoretical. Differential privacy is already being used by large technology companies like Google and Apple, as well as in US Census result reporting.
Dr. Yun Lu of the University of Victoria specializes in differential privacy and she joins the show to explain differential privacy, why it's such a promising and compelling framework, and share some of her research on applying differential privacy in voting and election result reporting.
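To make the definition above concrete, here is a minimal sketch of one common approach, the Laplace mechanism, applied to a simple counting query. It is an illustration only, not something taken from the episode or from Dr. Lu's research: the function names, the sample data, and the epsilon value are all assumptions chosen for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that match predicate.

    A counting query has sensitivity 1: adding or removing one record
    changes the true count by at most 1. Laplace noise with scale
    1/epsilon therefore gives epsilon-differential privacy for this
    single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [23, 35, 41, 29, 52, 38, 61, 19]  # made-up data for illustration
    print(private_count(ages, lambda a: a >= 30, epsilon=0.5, rng=rng))
```

A smaller epsilon means more noise and stronger privacy, while a larger epsilon gives more accurate answers with a weaker guarantee. Repeating the query spends additional privacy budget, so a real system would also need to track the total epsilon used.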
Topics:
What’s your educational background and work history?
What is differential privacy?
What’s the history of differential privacy? Where did this idea come from?
How does differential privacy cast doubt on the results of the data?
What problems does differential privacy solve that can’t be solved by existing privacy technologies?
When adding noise to a dataset, is the noise always random or does it need to be somehow correlated with the original dataset’s distribution?
How do you choose an epsilon?
What are the common approaches to differential privacy?
What are some of the practical applications of differential privacy so far?
How is differential privacy used for training a machine learning model?
What are some of the challenges with implementing differential privacy?
What are the limitations of differential privacy?
What area of privacy does your research focus on?
Can you talk a bit about the work you did on voting data privacy?
How have politicians exploited the data available on voters?
How can we prevent privacy leakage when releasing election results?
What are some of the big challenges in privacy research today that we need to try to solve?
What future privacy technologies are you excited about?
Resources:
Dr. Yun Lu's research
The Definition of Differential Privacy - Cynthia Dwork
Differential Privacy and the People's Data
Protecting Privacy with MATH
Wednesday Oct 12, 2022
When managing your company’s most sensitive data, encryption is a must. To fit your overall data protection strategy, you need a wide range of options for managing your encryption keys so you can generate, store, and rotate them as needed.
The risk of sensitive data being misused or stolen can be limited, as long as just the people (and services) who are authorized to access data for approved purposes can access the key. Without proper management of encryption keys robust encryption techniques can be rendered ineffective. So, while encryption is a core feature of any effective data privacy solution, encryption only enhances data privacy when paired with effective key management.
Osvaldo Banuelos, lead software engineer at Skyflow, joins the show to share his knowledge and expertise about encryption key management and its role in modern data privacy.
Topics:
Can you share some of your background from where you started your engineering career to where you are today?
How did you end up with an interest in working in the data privacy space? And what’s your work history in this space?
What are the fundamentals of encryption?
How do you protect against the key being leaked and rendering encryption ineffective?
What is an encryption key management system?
What are the components of an effective key management system?
How often should a key be rotated and does a KMS provide that functionality out of the box?
How does key rotation work?
Who within an organization is typically responsible for key management?
What are some of the popular KMS systems on the market today?
From a feature perspective, are they all the same or are there pros and cons to one over the other?
How does key management work in Skyflow?
When it comes to data privacy, you likely want to protect your customer data using encryption and to do that effectively, you need a robust key management system. Is this enough or are there other technologies a company would need to incorporate to have an effective privacy and compliance strategy?
What are the big gaps in data privacy today? What future technology or development are you excited about?
Where should someone looking to learn more about the data privacy space begin?
Resources:
Encryption Key Management and Its Role in Modern Data Privacy
Wednesday Oct 05, 2022
Joe McCarron, with prior roles at Zendesk and Apollo GraphQL, has spent much of his career thinking about and building products for developers. Today, he serves as the product lead for Skyflow's Vault, APIs, and developer experience.
In this episode, Joe discusses how his undergrad in Political Science and career working on developer-first products led him to Skyflow. Sean and Joe discuss tokenization and encryption, how they are different, the problems they solve, and what every engineer should know about these techniques.
Topics:
What is your background as a product manager?
How did you end up with an interest in working in the data privacy space? And what’s your work history in this space?
What is tokenization?
What is encryption?
How are they different?
What problems does each solve?
Do they compete with each other or are they complementary?
How much does your typical engineer need to understand about encryption and tokenization?
What are the big gaps in data privacy today? What future technology or development are you excited about?
Where should someone looking to learn more about the data privacy space begin?
Resources:
Demystifying Tokenization
Wednesday Sep 28, 2022
Dr. Lorrie Cranor began her career in privacy 25 years ago and has been a professor at Carnegie Mellon University in the School of Computer Science for 19 years. Today, she serves as director and professor for the CMU privacy engineering program.
In this episode, Dr. Cranor discusses how she started her career in privacy and then eventually moved into academics. She talks about the history of the CMU privacy engineering program, what the program entails as a student, and the career opportunities available to graduates.
Dr. Cranor's area of research focuses on the usability of privacy and privacy decision making. She discusses several recent studies looking at how real world users understand and navigate cookie consent popups and design best practices for companies. She also explains privacy labels and how developers building applications on iOS and Android can do a better job creating these labels.
We also discuss the future of privacy education and technologies, touching on the responsibilities of companies and privacy-enhancing technologies like differential privacy.
Topics:
How did you get interested in security and privacy and start working in this field?
What’s the history of CMU’s Privacy Engineering Program? How did it start?
Which department is the program part of?
If I’m taking the Master’s degree program, what does that consist of?
What’s the typical undergraduate background of someone taking the Master’s degree program?
Do graduates typically end up working as privacy engineers and what sort of companies do they end up at?
What’s the difference between the Master’s program and the certificate program?
How has engagement with the privacy program changed over the past decade?
Should privacy education be part of a standard software engineering undergraduate program?
How would you describe your areas of privacy research and the types of problems you’re interested in studying?
What have you discovered about how individuals make privacy-related decisions?
How can companies go beyond the bare minimum in terms of communicating privacy choices to their users?
Privacy choices are notoriously difficult to navigate and understand. What does your research teach us about improving the usability of UX for privacy controls?
How can you test privacy choice? Does the collection of test data potentially violate someone’s privacy?
What is a privacy nutrition label and what problems is it meant to address?
Starting in 2020, Apple started using this concept by requiring that all apps in the Apple app store include a privacy label. Labels are self-generated by the app developer. How good is the resulting privacy label if the developer lacks privacy training and education?
What are the common mistakes developers are making with creating these privacy labels?
What advice do you have for developers so that they can create an accurate privacy label?
Cookie consent overlays and popups are now very common. What event led to the introduction of these consent dialogs for consumers?
What problems have you discovered with the usability of cookie consent screens?
Do we need privacy regulations like GDPR to be more prescriptive in terms of how you meet their requirements, which could include usability guidelines for something like cookie consent?
Thoughts on the future of privacy engineering?
What are your predictions about privacy education and awareness over the next 5-10 years?
Resources:
CMU's Privacy Program
Dr. Cranor's Research
Related episode:
Data Protocol’s Privacy Engineering Certificate Course with Jake Ward