2011.08.18
Attending: Josh, Simon, Scott, Andy, Jason
Agenda
- 2 week Review
- Scott Littlewood
  - Meet and greet.
- Governance ticket 6446
  - Updated graffle
  - Wishlist from Andy
  - Making docs actionable
- Next 2 week plan
  - "Secure Alias"
Notes
* Steering committee (Andy)
* All on board
* All tickets seem to match up
* Simon has found all the materials that Andy would have forwarded.
* 2 week Review
* Scott Littlewood
* Meet and greet.
* Andy: are you up-to-date from Simon? or first OMERO/HIC discussion?
* Elwood: brief history from Simon
* Andy: setting scene
* Scottish informatics program
* Working with longitudinal datasets for research/tertiary use
* Pharmacovigilance and pharmacogenetics
* Computing problem and governance problem
* Conflicting agendas; currently focusing on governance
* see documents 6425 (project outline)
* Pop up to HIC to see data (also on sample server)
* Governance ticket 6446 at 14:15
* Updated graffle
* Andy: bang on in terms of how current system works from HIC perspective
* Need to play with the XML, but what's drawn is correct.
* Wishlist from Andy (6454)
* Simon: If you want data presented in a particular way (in websilo) tell us
* Jason: 6453, any guidance on 'pile of crap' v. 'gold'
* Andy: Caldicott report is key
* blueprint is draft, but will be defining the way forward
* "releasing micro data: to follow"
* Focus on: Audit trail
* Automating the rest is novel.
* Making docs actionable
* Andy will pester Alison et al. for doc prioritization
* Jason: any problem with list being public?
* Andy: not a problem.
* 6454
* Jason: napkin drawings?
* Andy: forwarded to Alison and Duncan
* Off the top of Andy's head:
* "what data source used by a researcher?"
* "...via which project?"
* "how many projects has subject X been in?"
* Simon:
* Great examples
* gives us something to hold on to
* Next 2 week plan
* "Secure Alias"
* "Show me which projects I've been used by."
* But that won't be going in front of patients any time soon because of fears.
* Instead: "is someone trying to re-identify an individual?"
* Another project: research register
* "I'm interested in doing research"
* "I want to opt out of drug studies"
* "I want to study diabetes"
* don't need uploadable dictionaries
* there will be anonchi hash only (illegal to provide chi)
* "data looks funny for some prochi?"
* send anonchi to themselves on the NHS side.
* audit trail: patients may get a business card with their anonchi
* opt out for being in research; hide them from all projects
* Josh: imagine we have a long list (prioritized) we want to start checking off.
* Andy: without any doubt...
* What datasets have been used by which project?
* Which projects used by Helen?
* Gets into privacy impact assessment / proportional risk
* things to flag
* any count returned by a query that is less than 5 (aggregates)
* Andrew's example
* new dataset owner
* "I want to give you this data"
* show me the data trail of usage
* like an OLAP to cross-reference any of the dimensions
* Need to produce another fake project from the same dataset
* Elwood: identifiable when anonymized?
* Andy: chi has date of birth
* Person can have multiple chi
* Create one number to link anonchis without DOB
* Postcode & date of birth can be presented
* depending on research proposal
* instead "age at extract" or "+/- 2G days" or "YOB"
* zone of postcode for economic index
* offset postcode.
* SIM rather than postcode
* Andy: it's a fuzzy science
* the fear is that the more data you give, the likelier re-identification becomes
* there's no real anonymization. call this "pseudo-anonymization"
* people don't like sharing personal data
* Andy: when creating a project extract, go to system and say, "is this safe?"
* Proportional risk
* Andy:
* some systems are going for lock down
* instead, be looser but blueprint includes heavy sanctions
* block within 10 minutes.
* Elwood: number of violations?
* Andy: yes, prevent linkages between two projects
* They will have the same data in both
* but they should treat them separately.
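
Two of the safeguards discussed above (replacing date of birth with "age at extract" or "YOB", and flagging counts below 5) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the threshold constant, and the input shapes are hypothetical and do not come from websilo or any HIC system.

```python
from collections import Counter
from datetime import date

# Threshold taken from the notes ("less than 5"); the constant name is hypothetical.
SMALL_COUNT_THRESHOLD = 5


def age_at_extract(date_of_birth: date, extract_date: date) -> int:
    """Whole years between birth and extract, released instead of the full DOB."""
    years = extract_date.year - date_of_birth.year
    # Subtract one if the birthday has not yet occurred in the extract year.
    if (extract_date.month, extract_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def year_of_birth(date_of_birth: date) -> int:
    """Coarser alternative to the full DOB: the year only."""
    return date_of_birth.year


def flag_small_counts(values, threshold=SMALL_COUNT_THRESHOLD):
    """Return {value: count} for every group whose count is below the threshold."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n < threshold}
```

Which generalization applies would depend on the research proposal, as noted above; the sketch only shows that each one is a small, auditable transformation.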
* Way forward (Josh)
* Definitely get separate projects
* Andy: about 10 projects active at one time.
* Then just get something working
* New data; "can you show us how safe it is"
* SMR in audit table
* Here are the projects
* Here's what's been done with it
* Here's what each researcher did (behaving?)
* 3 dimensions: dataset / researcher / project
* abstract usually says "response of X after 2 years ..."
* GoDARTS was much bigger. "Questions about diabetes"
* Possibly getting diagrams / graffles of possible workflow OK'd first.
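
The three audit dimensions above (dataset / researcher / project) and the example questions for 6454 suggest a simple cross-reference over audit records. The following is a minimal sketch, assuming one record per access. The record layout, the `cross_reference` helper, and all values other than "SMR" are hypothetical placeholders, not the actual audit XML.

```python
from collections import namedtuple

# Hypothetical audit record: one row per access, covering the three dimensions.
AuditEntry = namedtuple("AuditEntry", ["dataset", "researcher", "project"])


def cross_reference(entries, select, **filters):
    """Distinct values of dimension `select` among entries matching every filter."""
    return sorted({
        getattr(entry, select)
        for entry in entries
        if all(getattr(entry, dim) == value for dim, value in filters.items())
    })


# Placeholder data; only "SMR" appears in the notes.
AUDIT = [
    AuditEntry("SMR", "researcher_a", "project_1"),
    AuditEntry("SMR", "researcher_b", "project_2"),
    AuditEntry("dataset_b", "researcher_a", "project_1"),
]

# "What datasets have been used by which project?"
datasets_in_project_1 = cross_reference(AUDIT, "dataset", project="project_1")

# "Which projects has a given researcher worked on?"
projects_for_researcher_a = cross_reference(AUDIT, "project", researcher="researcher_a")
```

Any dimension can be selected and any combination of the others used as a filter, which is roughly the OLAP-style cross-referencing mentioned earlier in the notes.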
Actions
- Andy: push steering on the documentation
- Andy: put in vague questions for 6454
- Andy: get more concrete questions for 6454 from Alison and Duncan
- Andy: also 6454 questions from steering committee
- Andy: add ... to steering committee agenda (from Jason)
- Andy: create audit xmls for different projects
- Simon: ticket "websilo: Violation report"
- Simon: ticket for creation of projects