Paul Webb examines the options for data analysis on RISC OS.
We live in a world that is awash with data. One only has to think of the importance of School League Tables or the FTSE index to appreciate that information of all types has an increasingly prominent role to play in our modern lives.
But do we have to accept the 'spin' on data which is presented to us in the media? Is there any way in which we can systematically evaluate accepted practices?
Fortunately, RISC OS users can answer 'no' to the first question and 'yes' to the second question because of the availability of 1st from Serious Statistical Software (SSS) and Analysis from Giovanni Lo Conti - two contrasting but complementary data analysis packages.
The Quantitative and Qualitative Divide
So why look at two packages rather than one? Well, 1st and Analysis meet different needs. 1st enables the user to examine data which is usually quantitative or numerical in nature, whereas Analysis can also deal with qualitative or textual data. Think, for example, of a person who tells a researcher how many hours he works each week and what he 'feels' like when he is working. The former piece of information - number of hours worked - is an example of quantitative data, whereas the latter piece of information - a description of his emotions - exemplifies qualitative or textual data.
Of course the quantitative/qualitative distinction does not always apply. The quantitative data analyst may occasionally deal with examples of categorical data like social class or gender and the qualitative analyst may obtain numerical information from text.
But, in general, the distinction remains a useful one to bear in mind as you read this article.
1st - or Fully Interactive Regression STatistics to give the application its full title - arrives on two discs. The Program Disc contains the application itself whilst the Manual Disc contains an interactive manual and a demonstration program which shows off 1st's capabilities.
Installation is simplicity itself and merely entails creating a directory on your hard disc as a prelude to copying the contents of both discs into it. Users without a hard disc are also able to run 1st from the Program Disc.
Activating 1st is very straightforward. The user simply double-clicks on the 1st icon and the application's icon appears on the icon bar. CSV and 1st files can then be dragged onto the 1st icon at which point 1st displays your data in the form of a data matrix or sheet. The size of the sheet can also be specified where necessary. Entering data is likewise very straightforward. All you need do is click on an individual cell to enter a value. Figure one below shows a sample sheet which was generated with one of the data sets supplied with 1st.
1st seems to be able to perform an extremely comprehensive range of statistical calculations on any data set. One can, for example, scrutinise one variable at a time or plump for a more ambitious multivariate analysis. What 1st will not do, however, is generate meaningful results from data which are meaningless. It therefore pays to plan your study very carefully before actually collecting and analysing any data. 'Garbage in, garbage out' is an expression which the budding data analyst would do well to remember.
1st's statistical facilities can be accessed by clicking the middle mouse button over the sheet. This reveals a 'Data Ctrl' menu from which a range of statistics can be selected (see figure two below).
1st also generates textual or graphical report windows on the basis of the analysis that the user selects. Figure three (a Draw file) shows a regression plot which was generated after following one of the supplied tutorials whilst the information in report one was produced by generating a textual report window and saving it as a text file.
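For readers who have not met regression before, the idea behind a plot like figure three can be sketched in a few lines of Python. This is only an illustration of an ordinary least squares fit of a straight line; it is not taken from 1st, and the function name and the small data set are made up for the example.

```python
def linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, invented for this sketch: the points lie on y = 2x + 1.
hours = [1, 2, 3, 4, 5]
scores = [3, 5, 7, 9, 11]
slope, intercept = linear_regression(hours, scores)
print("slope:", slope, "intercept:", intercept)
```

A regression plot simply draws the data points together with the fitted line, so that the reader can judge by eye how well the line describes the data.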
Analysis is, in contrast, essentially a textual analysis program. It supports a wide range of analytic procedures including concordances, keywords-in-context (kwic) and the identification of co-occurrences. Figure four below shows the main window which is used to drive the application.
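To give a flavour of what a keywords-in-context listing involves, here is a minimal Python sketch. It is not how Analysis itself works, and the function name kwic, the width parameter and the sample sentence are all assumptions made for illustration. Each occurrence of the keyword is shown in square brackets with a few words of context on either side.

```python
import re


def kwic(text, keyword, width=3):
    """Return one line per occurrence of keyword, with up to
    `width` words of context on each side."""
    words = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


# Hypothetical sample text, invented for this sketch.
sample = "Work gives me purpose. I feel tired after work but proud of my work."
for line in kwic(sample, "work", width=2):
    print(line)
```

Lining up every use of a word in this way makes it much easier to see the different senses in which a speaker or writer employs it.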
Installation is just as straightforward as it is with 1st. Analysis is simply placed in its own directory on your hard disc and run in the usual RISC OS way. The user basically double-clicks on the Analysis icon and the application's icon appears on the icon bar.
Analysis is then activated by dragging and dropping the plain text file that you wish to study onto the icon bar icon. The main window consequently appears and the user selects the required analysis from the available menu before clicking on the 'OK' button. The application finally produces an output file which contains the results of the analysis.
As Analysis makes such an eclectic range of techniques available to the researcher, its target audience will be varied, although it will be particularly useful to anyone interested in linguistics. Researchers who deal with qualitative data more generally can also make use of its facilities.
Think, for example, of the researcher who wishes to investigate the values which a group of people hold. After interviewing each person the researcher would carefully read through each interview transcript in order to identify points of commonality. Recurring themes would then be noted in the margin of each transcript before the researcher considered how they were interrelated. In this way, the purely qualitative researcher would be able to construct a picture of the 'conceptual universe' of the respondents by discovering key categories or values.
Moreover, Analysis could assist in this endeavour by making use of its 'List Frequency of Words' option. By counting word frequencies, Analysis allows the researcher to discover points of commonality between interview transcripts in a much more effective way than would be possible by traditional pen and paper methods. Figure five below illustrates the output which is produced after applying this option to a sample file.
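The principle behind a word frequency list is simple enough to sketch in Python. The following is an illustration only and makes no claim about how Analysis is implemented; the function name and the sample transcript are invented for the example.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lower-cased word to its number of occurrences."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical interview fragment, invented for this sketch.
transcript = "I value my family. My family values honesty, and I value honesty too."
freq = word_frequencies(transcript)

# Show the most frequent words first.
for word, count in freq.most_common(5):
    print(word, count)
```

Note that a simple count like this treats 'value' and 'values' as different words, so the researcher still has to decide which forms belong to the same theme.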
So how do 1st and Analysis measure up?
I have no hesitation in recommending 1st or Analysis to the readers of RISC World. 1st is a superb piece of software which provides all the statistical facilities that the professional researcher or student is ever likely to need. If the RISC OS market recovers (and I for one remain optimistic) it should meet the statistical needs of many a high school and university student. One has only to think of the popularity of A-Level subjects like Maths with Statistics or Sociology and Psychology to appreciate that there is a potential market ready to be tapped. I am, however, slightly more circumspect in my praise for Analysis. Although it has the makings of an excellent textual analysis package, some important analytic techniques are missing.
But it seems uncharitable to be too critical. After all, both programs are obviously a testament to the commitment of RISC OS programmers to our platform. So if you are thinking of getting into Data Analysis for RISC OS, check out 1st and Analysis. You won't regret it.