To prevent the large statistical analyses from overloading the server, they are queued (reminding me of old-fashioned batch processing). The queuing is done using RabbitMQ. A RESTful interface is then used to communicate with the clients (simpler than SOAP). The Django web framework was used with Python to tie all the components together.
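To make the queuing idea concrete, here is a minimal sketch of my own, not Richard's code. It swaps RabbitMQ for Python's standard-library `queue` module so it runs without a broker, and the names `run_analysis`, `worker` and `process_batch` are hypothetical. The point is only that jobs are placed on a queue and a single worker takes them one at a time, so a burst of requests cannot overload the server.

```python
import queue
import statistics
import threading


def run_analysis(values):
    """Hypothetical stand-in for a large statistical analysis."""
    return {"n": len(values), "mean": statistics.mean(values)}


def worker(jobs, results):
    """Take jobs off the queue one at a time until the sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work to do
            jobs.task_done()
            break
        job_id, values = job
        results[job_id] = run_analysis(values)
        jobs.task_done()


def process_batch(batch):
    """Queue every job in `batch` and return the results keyed by job id."""
    jobs = queue.Queue()  # in-process stand-in for a RabbitMQ queue (assumption)
    results = {}
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    for job_id, values in batch.items():
        jobs.put((job_id, values))
    jobs.put(None)  # tell the worker to stop once the queue is drained
    jobs.join()
    thread.join()
    return results


if __name__ == "__main__":
    print(process_batch({"a": [1, 2, 3], "b": [10, 20]}))
```

In the real system the queue would presumably live in RabbitMQ and the worker would run as a separate process, with the web front end only adding jobs and collecting results.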
I had difficulty understanding the technical details of Richard's explanation of how the system works, as I am not a Python user (I felt he was giving the equivalent of five conference talks at once). This reminded me a little of LaTeX for typesetting, but applied to statistics.
However, the issue of having to carry out complex and very precise calculations on large amounts of data that affect large payments is familiar to me from work at the Commonwealth Schools Commission to calculate billions of dollars in payments to schools.
What was exciting was Richard's broad vision of how such systems could be used for corporate and public benefit. He mentioned Google's Content Identification and Management System which allows rights owners to "Choose, in advance, what they want to happen when those videos are found. Make money from them. Get stats on them. Or block them from YouTube altogether...".
One interesting comment was about how important the payments from the Copyright Agency were to Australian authors.
Richard then talked about how a Kanban Board is used for the software development.
Richard will be at John Maindonald's Statistical Learning and Data Mining using R course at ANU in Canberra, from 17 to 21 January 2011.