Chapter 10 · Data Analyst

Handling data responsibly

Most training teaches you to query data and ignores the question of whether you are allowed to. Mishandling personal data is a legal and ethical risk, not just a technical one, and it is increasingly tested in interviews. This chapter covers what a data analyst specifically needs. A separate cross-role Compliance and Domain Knowledge guide goes deeper across industries; here we focus on the habits that apply while you write SQL and build reports.

10.1 Know what you are holding

Personally identifiable information, or PII, is any data that can identify a specific person, either directly (name, email, national ID) or indirectly in combination (birth date plus postal code plus gender can re-identify someone). Two subsets carry extra rules: in the US, health data falls under HIPAA, and payment-card data falls under PCI DSS, an industry security standard. Many privacy laws also define a category of sensitive data, such as government IDs, precise location, and biometric data, that requires stricter handling.
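To make the indirect case concrete, here is a minimal sketch that counts how many rows are unique on a combination of fields even though no name or email is present. The records, field names, and the `unique_combinations` helper are hypothetical, made up for illustration; a real check would run against your own schema inside an approved system.

```python
from collections import Counter

# Hypothetical records: no names or emails, only indirect identifiers.
records = [
    {"birth_date": "1990-04-12", "postal_code": "02139", "gender": "F"},
    {"birth_date": "1990-04-12", "postal_code": "02139", "gender": "F"},
    {"birth_date": "1985-11-03", "postal_code": "02139", "gender": "M"},
    {"birth_date": "1972-07-21", "postal_code": "94110", "gender": "F"},
]

# Assumed set of fields that could identify someone in combination.
QUASI_IDENTIFIERS = ("birth_date", "postal_code", "gender")


def unique_combinations(rows, fields=QUASI_IDENTIFIERS):
    """Return the field combinations that match exactly one row."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


risky = unique_combinations(records)
print(f"{len(risky)} of {len(records)} rows are unique on {QUASI_IDENTIFIERS}")
```

A row that is unique on these fields points to one person as surely as a name would, so a dataset like this should be treated as PII even after the obvious identifiers are dropped.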

10.2 Safe-handling habits

  • Pull the minimum. If you do not need the email to answer the question, do not select it. The safest field is the one you never queried.
  • Prefer aggregated or de-identified data. Work at the level the question needs. A trend by region rarely requires individual records.
  • Keep regulated data in approved systems. Do not export PII to a local spreadsheet, a personal drive, or an unapproved AI tool.
  • Respect access and purpose. Having query access to a field is not the same as being allowed to use it for this purpose.
  • Ask before, not after. If you are unsure whether a field is permitted, check with data governance before you run the query.
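The first two habits can be sketched with a small in-memory SQLite database. The `orders` table, its columns, and the sample rows are assumptions invented for this example. The query names only the columns the question needs and aggregates by region, so the email column never leaves the database.

```python
import sqlite3

# Hypothetical table for illustration; real schemas will differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, email TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "a@example.com", "North", 120.0),
        (2, "b@example.com", "North", 80.0),
        (3, "c@example.com", "South", 50.0),
    ],
)

# The question is "revenue by region", so select only region and amount,
# aggregated. Avoid SELECT *: it would pull the email column for no reason.
revenue_by_region = conn.execute(
    "SELECT region, SUM(amount) AS revenue, COUNT(*) AS order_count "
    "FROM orders GROUP BY region ORDER BY region"
).fetchall()

print(revenue_by_region)  # [('North', 200.0, 2), ('South', 50.0, 1)]
```

The result contains no individual records and no PII, which makes it much safer to share in a report than a row-level extract.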

10.3 Using AI tools responsibly

  • Always validate the output. AI can produce confident, wrong queries. Check results against known totals exactly as you would your own work, and be able to explain any generated query line by line, because you own the result.
  • Never paste regulated data into an unapproved tool. Feeding PII, health, payment, or confidential records into a public AI tool can be a serious data breach. Use only employer-approved, governed environments for anything touching sensitive data.
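One way to check a generated query against a known total is sketched below. The `matches_control_total` helper, the `orders` table, and the trusted figure of 250.0 are all hypothetical; in practice the control total would come from a source you already trust, such as a finance report.

```python
import math
import sqlite3


def matches_control_total(conn, query, expected_total, rel_tol=1e-9):
    """Sum the last column of the query result and compare it to a trusted figure."""
    rows = conn.execute(query).fetchall()
    actual = sum(row[-1] for row in rows)
    return math.isclose(actual, expected_total, rel_tol=rel_tol), actual


# Hypothetical data: one order was cancelled and should not count as revenue.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("North", 120.0, "completed"),
        ("North", 80.0, "completed"),
        ("South", 50.0, "completed"),
        ("South", 40.0, "cancelled"),
    ],
)

KNOWN_REVENUE = 250.0  # assumed trusted figure for completed orders

# A plausible-looking generated query that forgets the status filter.
generated = "SELECT region, SUM(amount) FROM orders GROUP BY region"
# The same query after review, with the filter added.
reviewed = (
    "SELECT region, SUM(amount) FROM orders "
    "WHERE status = 'completed' GROUP BY region"
)

generated_ok, generated_total = matches_control_total(conn, generated, KNOWN_REVENUE)
reviewed_ok, reviewed_total = matches_control_total(conn, reviewed, KNOWN_REVENUE)
print(generated_ok, generated_total)  # False 290.0
print(reviewed_ok, reviewed_total)    # True 250.0
```

The generated query runs without error and returns tidy numbers, which is exactly why a control total is worth the extra step: the mismatch is the only signal that cancelled orders were included.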

10.4 Domain knowledge

Industry | Know that it exists | Why it matters
Banking / finance | FFIEC reporting; Federal Reserve FR Y-14 stress-test data | Regulatory reports demand accuracy and lineage
Healthcare | HIPAA and protected health information | Default to de-identified data; strict access
Retail / e-commerce | PCI DSS for card data; retail metrics | Never store raw card data; know the KPIs
Technology | GDPR and CCPA on user data | User-level data is tightly access-controlled
