2011.08.01

Attending: Josh, Simon, Andy, Jason

* Josh: can we figure out what you'd want regardless of OMERO
* Andy: slippery slope of different desires
* SqlServer v. OMERO server
* Andrew started out by wanting to query what's happening with datasets
* Knock-on implications
* Team is used to SQL
* e.g. longitudinal data, up to 5000 columns per person
* 40-50 queries, aliasing, subsets, aggregates, one join in every query
* Trying to ignore the data mgmt. and not change their part of work
* Think carefully about what's possible, based on the scope
* Technically feasible versus usable
* They just use SQL to extract data from large tables into subsets
* Ask them: "Anything in SQL select statement, temp table"
* Don't want to learn new syntax.
* Managing a 1-year pilot, so have to keep things separate.
* Just go with OMERO model
* Longer-term: risk if we don't have SQL interface, even governance people think it's great.
* Josh: they can create tables, etc? Definitely.
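For illustration only, a minimal sketch of the kind of extract query described above (aliasing, a subset, an aggregate, one join, and a temp table). It runs against an in-memory SQLite database from Python rather than SqlServer. The table and column names (`person`, `measurement`, `chi`, `systolic`) and all rows are hypothetical and are not taken from the analysts' actual queries or data.

```python
import sqlite3


def run_example_extract():
    """Build a toy database and run one analyst-style extract query."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Hypothetical tables: one row per person, many measurement rows per person.
    cur.executescript("""
        CREATE TABLE person (chi TEXT PRIMARY KEY, birth_year INTEGER);
        CREATE TABLE measurement (chi TEXT, visit INTEGER, systolic REAL);
        INSERT INTO person VALUES ('A', 1950), ('B', 1972), ('C', 1985);
        INSERT INTO measurement VALUES
            ('A', 1, 140.0), ('A', 2, 150.0),
            ('B', 1, 120.0), ('C', 1, 110.0);
    """)
    # Aliases (p, m, subject), a subset (WHERE), aggregates (COUNT, AVG),
    # a single join, and the result stored in a temp table.
    cur.execute("""
        CREATE TEMP TABLE older_bp AS
        SELECT p.chi AS subject,
               COUNT(*) AS n_visits,
               AVG(m.systolic) AS mean_systolic
        FROM person AS p
        JOIN measurement AS m ON m.chi = p.chi
        WHERE p.birth_year < 1980
        GROUP BY p.chi
    """)
    rows = cur.execute(
        "SELECT subject, n_visits, mean_systolic FROM older_bp ORDER BY subject"
    ).fetchall()
    conn.close()
    return rows
```

Anything that replaces the SQL interface would need to cover at least this pattern without asking the analysts to learn new syntax.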

* Simon: different people
* Preparing/cleaning in NHS SqlServer (out of scope)
* HIC data analysts/admins (Chris, Al., Andy, 1 more and 2 trainees) --> Flat files (e.g. OMERO --> OMERO)
* Project silo: researchers looking at their own data
* Governance users: guardians at data source or patients, "who's using my data"

* Andy: Researchers don't tend to know SQL
* andy:

* Governance
* Terms
* Disclosure (done on aggregates)
* Proportionate ...
* All being made up as we go along
* e.g. "privacy risk of joining 2 tables"?
* talking about that in SHIP, but it's not defined.
* no one's tried to automate it before (That's the ticket!)
* Andrew seems to be more focusing on the audit trail
* powerstation-like feed
* Perhaps a layer to manage the various auditing sources.

* Big face-to-face
* more about us sitting in a room rather than steering committee
* external auditing about how the datasets are used
* not currently much done at the project level.

* To decide (Josh)
* schema of the auditing and example data
* Andy: but that's also what we need to decide.
* Simon: how static is that info?
* Think so: what datasets used by what project / researcher
* Andy: e.g.
* Project silo 267
* know which tables a researcher CAN access, but not that they have.
* definition of governance
* Andy to put up links to best practice documents (ISD's current)
* as if it's only ever been used on aggregates.
* wiki page with
* various auditing information in SqlServer
* the automated oversight pieces (PM service, etc.)
* graffle of the audit data flow
* best practices and SHIP drafts
* definition of XML exchange format (chi mappings, security, etc)

* Jason: what is "the audit info"?
* Andy: simplest level (current) who can or cannot access what
* Andrew was saying in steering meeting
* any question along 3 dimensions
* who or what has used any data from a dataset (rows per past month)
* what has project 267 used
* what has the subject level used (patient, give me an audit trail of researcher X)
* audit info from SqlServer into OMERO?
* One part, yes.
* ...josh went through the whole spiel...
* Jason: why are we trying to preserve what you guys are doing?
* i.e. imagine where we are 18 months from now.
* Are the DCs SqlServers or CSV files?
* Andy: ignoring NHS SqlServer (out-of-scope)
* have more stable data (static for a month, etc.) on the uni-side
* cleanest: SqlServer push to OMERO via a long-query
* benefit is it's a quick solution.
* SQL because data analysts use complicated queries
* and there'd be a re-training and implementation problem.
* perhaps in a year's time. (trying not to break everything at once)
* Researcher might query
* boxi and business objects, then out-of-scope of OMERO
* (don't particularly like it though)
* Josh: also a concern for us since then there's something missing from the end product
* Jason: again, generic DC is producing data in what format?
* Andy: flat files --> SqlServer --> flat files
* XML --> Sql query
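
The three audit dimensions raised above (by dataset, by project, by subject) could be sketched as queries over a single usage log. This is only a rough sketch under stated assumptions: the `audit_event` table, its columns, and the sample rows are all hypothetical, since the audit schema is still one of the items to decide. It uses an in-memory SQLite database from Python rather than SqlServer or OMERO, and the subject-level query reflects just one reading of that dimension.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical audit log: one row per subject record used."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE audit_event (
            researcher TEXT, project INTEGER, dataset TEXT,
            subject TEXT, used_on TEXT  -- ISO date, e.g. '2011-07-03'
        )
    """)
    conn.executemany(
        "INSERT INTO audit_event VALUES (?, ?, ?, ?, ?)",
        [
            ("r1", 267, "biochem", "P1", "2011-07-03"),
            ("r1", 267, "biochem", "P2", "2011-07-03"),
            ("r2", 301, "biochem", "P1", "2011-07-10"),
            ("r1", 267, "prescribing", "P1", "2011-07-15"),
        ],
    )
    return conn


def usage_by_dataset(conn, dataset, since):
    """Who has used rows from a dataset on or after `since`, and how many."""
    return conn.execute(
        """SELECT researcher, project, COUNT(*) AS n_rows
           FROM audit_event
           WHERE dataset = ? AND used_on >= ?
           GROUP BY researcher, project
           ORDER BY researcher, project""",
        (dataset, since),
    ).fetchall()


def datasets_by_project(conn, project):
    """Which datasets a project has actually used (not merely may access)."""
    rows = conn.execute(
        "SELECT DISTINCT dataset FROM audit_event WHERE project = ? ORDER BY dataset",
        (project,),
    ).fetchall()
    return [r[0] for r in rows]


def trail_by_subject(conn, subject):
    """Which researchers and projects have used a given subject's records."""
    return conn.execute(
        """SELECT DISTINCT researcher, project
           FROM audit_event
           WHERE subject = ?
           ORDER BY researcher, project""",
        (subject,),
    ).fetchall()
```

A log of this shape records actual use, which is distinct from the current access-control information about which tables a researcher can access.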
