2011.05.14
Attending: Andy, Josh, Simon
Notes
Andy: status update for Andrew
Are there any blockers?
Anything that HIC can help with?
Simon: still working on script
Should be in the expected position from last time next week?
Data In / Out finished
Have been side-tracked by the Getting Started documentation
Josh: should also end up as a researcher-developer overview page
Simon: even better. Have been working with Will for trails
- Through the wiki for how to write scripts
Andy: important for down the road when OMERO team moves on
- Others will need to maintain
- Andy may need to start writing scripts even before the researchers.
Job adverts
Andy: have we heard anything?
@@josh: Access to Alfresco and/or a Zip of the files
GoDARTS/Data
Andy: Have a meeting to work out the exact workflow
Jason holding off for the input/output script (?)
Earliest will be June
Simon: would be good to have some idea of the workflow
Andy: making sure data in/out for start of June
Andy: gone second half of June, may be a dead month
Josh: don't want anyone's expectations too high
Simon: was aiming for today, but hopefully caught up by the middle/end of next week
- Then set a soft deadline.
News/HW
Andrew has authorized a new physical host server
Current also hosts Citrix; resources will get shared between two boxes
Maybe move OMERO over and expand drive; a bit more flexibility.
Looking at the data
scidc is diabetes
most important is CHI_master_index (one record per person)
then biochemistry
then HbA1c, bmi, prescribing
then smr01 (linkage to external data set)
can ignore descriptor at the moment
hardcode 100 rows for the moment
CHImaster_ALL is longitudinal, demographic data
- Table structure is different
- ALL has more than one record per person
gro (general register / death)
- death should match to CHI_master_index
scidc
- @@Andy will come back with prescribing joins
- Josh: might be interesting to have the join information in the schema as well
- Andy: had thought about a command-line tool to slurp in the files
- Josh: same here.
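A minimal sketch of what such a command-line slurp tool could look like, assuming the extracts are comma-separated files with a header row (the file format is not stated in these notes). The names `slurp`, `main` and `--limit` are hypothetical; the default of 100 mirrors the "hardcode 100 rows" point above.

```python
import argparse
import csv
import io
import itertools


def slurp(stream, limit=100):
    """Read a header plus at most `limit` data rows from a CSV stream."""
    reader = csv.reader(stream)
    header = next(reader, [])  # empty list if the file has no lines
    rows = list(itertools.islice(reader, limit))
    return header, rows


def main(argv=None):
    """Hypothetical entry point: report what was read from each file."""
    parser = argparse.ArgumentParser(description="Slurp extract files (sketch)")
    parser.add_argument("files", nargs="+", help="CSV extract files to read")
    parser.add_argument("--limit", type=int, default=100,
                        help="maximum data rows to read per file")
    args = parser.parse_args(argv)
    for path in args.files:
        with open(path, newline="") as handle:
            header, rows = slurp(handle, args.limit)
        print("%s: %d columns, %d rows read" % (path, len(header), len(rows)))


# In-memory example with made-up values; only `prochi` is a real column name.
sample = io.StringIO("prochi,value\nP000000001,1\nP000000002,2\n")
header, rows = slurp(sample, limit=1)
```

Calling `main(["some_extract.csv"])` would run the same logic against a file on disk.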
Andy: This is typical of any experiment. Everything joins on prochi, with other tables
Josh: do we need to reify these other concepts?
Andy: not for the moment.
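A minimal sketch of the "everything joins on prochi" pattern, using an in-memory SQLite database purely for illustration. Only the table names and the `prochi` key come from the notes; the other columns (`sex`, `test_code`, `result`), the sample values and the helper `join_on_prochi` are assumptions.

```python
import sqlite3

# Hypothetical miniature tables; all non-prochi columns and values are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CHI_master_index (prochi TEXT PRIMARY KEY, sex TEXT);
    CREATE TABLE biochemistry (prochi TEXT, test_code TEXT, result REAL);
""")
conn.executemany("INSERT INTO CHI_master_index VALUES (?, ?)",
                 [("P000000001", "F"), ("P000000002", "M")])
conn.executemany("INSERT INTO biochemistry VALUES (?, ?, ?)",
                 [("P000000001", "CHOL", 5.2),
                  ("P000000001", "CHOL", 4.9),
                  ("P000000002", "CHOL", 6.1)])

KNOWN_TABLES = {"biochemistry"}


def join_on_prochi(conn, other_table):
    """Join the one-record-per-person index to another table on prochi."""
    # The table name is interpolated into SQL, so only allow known names.
    if other_table not in KNOWN_TABLES:
        raise ValueError("unknown table: %s" % other_table)
    sql = ("SELECT m.prochi, m.sex, o.* FROM CHI_master_index AS m "
           "JOIN %s AS o ON o.prochi = m.prochi "
           "ORDER BY m.prochi" % other_table)
    return conn.execute(sql).fetchall()


rows = join_on_prochi(conn, "biochemistry")
```

Each returned row carries the person-level columns first, followed by all columns of the joined table, so a person with several biochemistry results appears once per result.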
Keep in mind
- Variable names in the column names are not
- GoDARTS team likes their names in a certain style
- @@Josh: aliasing support in storage.
- @@Andy will check which columns have been changed from the underlying table
- Josh: May be a place where we want to use LSIDs
Andy: new extract of GoDARTS data is in process
- Hopefully we can work with that clean data
- At that time we could add "original_name"
Josh: need to handle multiple aliases?
- Andy: too complicated for a pilot.
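A minimal sketch of single-alias support as discussed, assuming one preferred name per original column (multiple aliases are out of scope for the pilot). The column names and the helper `rename_columns` are hypothetical; only the "original_name" idea comes from the notes.

```python
# Preferred name -> original_name as found in the extract (made-up examples).
ALIASES = {
    "prochi": "PROCHI",
    "hba1c": "HbA1c_Result",
}


def rename_columns(header, aliases):
    """Return the header with original names replaced by preferred names."""
    reverse = {orig: preferred for preferred, orig in aliases.items()}
    if len(reverse) != len(aliases):
        # Two preferred names mapping to one original would be ambiguous.
        raise ValueError("two aliases point at the same original column")
    # Columns without an alias keep their original name.
    return [reverse.get(name, name) for name in header]


header = ["PROCHI", "HbA1c_Result", "sample_date"]
renamed = rename_columns(header, ALIASES)
```

Keeping the mapping as data means the original names stay recoverable after renaming.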
Simon: currently working on uploading
- Josh: any column types missing?
- Simon: not yet.
- Andy: data types - integers, reals, floats,
Andy: one issue is an integer column with null handling.
- Josh: how many columns don't have a NULL-valid value?
@@Andy: perhaps add PositiveInt,etc. values to the schema
- Not doable in the short term; that doesn't exist yet.
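A minimal sketch of the integer-with-null issue, assuming missing values arrive as empty or whitespace-only cells (not confirmed in these notes). The helpers `parse_nullable_int` and `check_positive` are hypothetical; the second only illustrates what a PositiveInt-style check might do on the client side while the schema has no such type.

```python
from typing import List, Optional


def parse_nullable_int(raw: str) -> Optional[int]:
    """Parse one cell; empty or whitespace-only cells become None."""
    text = raw.strip()
    if text == "":
        return None
    return int(text)  # raises ValueError on non-integer text


def check_positive(values: List[Optional[int]]) -> List[int]:
    """Return the row indices whose value is present but not greater than 0."""
    return [i for i, v in enumerate(values) if v is not None and v <= 0]


cells = ["12", "", " 7 ", "-3"]  # made-up sample column
values = [parse_nullable_int(c) for c in cells]
bad_rows = check_positive(values)
```

Using `None` rather than a sentinel such as -1 keeps missing values distinct from every legitimate integer.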
Andy: HIC extracts are DAY/MON/YEAR, but specified in header
- also erroneous dates!
- Josh: do we store them as strings?
- No. Store as date. Though the researcher may tell us otherwise.
- but all in the same timezone
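A minimal sketch of storing extract dates as dates while flagging erroneous ones. It assumes the header-specified format translates to the `strptime` pattern `%d/%m/%Y` (numeric month); if the extracts use month abbreviations, the pattern would need to change. The helper `parse_dates` is hypothetical.

```python
from datetime import date, datetime
from typing import List, Optional, Tuple


def parse_dates(cells: List[str], fmt: str = "%d/%m/%Y"
                ) -> Tuple[List[Optional[date]], List[int]]:
    """Parse cells as dates; return parsed values and indices of bad cells."""
    parsed: List[Optional[date]] = []
    errors: List[int] = []
    for i, cell in enumerate(cells):
        try:
            parsed.append(datetime.strptime(cell.strip(), fmt).date())
        except ValueError:
            # Impossible or malformed dates are kept as None and reported.
            parsed.append(None)
            errors.append(i)
    return parsed, errors


cells = ["14/05/2011", "31/02/2011", " 01/06/2011 "]  # made-up sample column
dates, bad = parse_dates(cells)
```

Returning the indices of bad cells lets the erroneous dates be reported back rather than silently dropped.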
Josh: string lengths?
- Andy: Need to work out what the longest value was.
- 255 tends to work
- but it's trial and error
- often come out with a lot of space around it.
- prochi however is 10.
- @@Andy will ask around
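A minimal sketch of replacing the trial-and-error approach by measuring the longest value per column after stripping the surrounding whitespace. The helper `max_stripped_lengths`, the `drug_name` column and the sample values are hypothetical; only the 10-character `prochi` comes from the notes.

```python
def max_stripped_lengths(rows, header):
    """Return the longest whitespace-stripped value length per column."""
    longest = {name: 0 for name in header}
    for row in rows:
        for name, cell in zip(header, row):
            longest[name] = max(longest[name], len(cell.strip()))
    return longest


header = ["prochi", "drug_name"]
rows = [
    ["P000000001", "  metformin   "],
    ["P000000002", " insulin glargine "],
]
lengths = max_stripped_lengths(rows, header)
```

Running this over a full extract would give a measured upper bound per column in place of the 255 default.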
Josh: Looks like we're approaching a common description language.
- Andy: already used extensively internally
- ... to generate Word DOCs, etc.
- All in C#
- May be able to make things nicer on the Windows side.
Misc
Next talk May 27