# The difference between data and information

Final Exam


Data and information are two of the most commonly used words in research, whether qualitative or quantitative. The two words are sometimes used interchangeably, but they have very different meanings and roles in research. The difference between them lies in their definitions and applications. Data refers to raw, unanalyzed facts, figures, and events in quantitative research. Information is the useful knowledge derived from that data. In most cases data is presented in statistical form, although some researchers present it as text. Data results from a collection process that varies with the type of data the research requires. When the collection period is over, the researcher holds a large amount of data that requires analysis and presentation. Data that has been analyzed and organized for presentation is referred to as information. In most cases, researchers provide users with information rather than raw data, since it is easier to interpret in light of the research objectives and aims.

In general, data is information that has not yet been transformed into a meaningful form that meets the requirements of its intended users. This points to another difference between the two. Data is submitted for analysis in the form of text documents, spreadsheets, and statistical charts, while information is the interpreted result, prepared so that its intended users can understand it easily. Data and information also differ in how they are transformed and presented, and in their context. For example, a study testing the usage of a new milking machine might invite 50 farmers to respond. The researcher conducts a survey, which yields 40 complete questionnaires. Each of the 40 questionnaires contains data that has no context on its own and requires proper presentation. When the data from the 40 respondents is compiled and presented in a recommended format, it becomes information. Information is typically presented in the form of graphs and charts.
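The step from data to information described above can be sketched in a few lines of Python. The answers below are invented purely for illustration (the essay does not report the actual questionnaire responses), and the function name `summarize` is hypothetical.

```python
from collections import Counter

# Hypothetical raw data: one answer per completed questionnaire.
# These values are invented for illustration; they are not survey results.
raw_responses = ["satisfied", "satisfied", "neutral", "dissatisfied", "satisfied"]


def summarize(responses):
    """Turn raw answers (data) into percentage shares (information)."""
    counts = Counter(responses)
    total = len(responses)
    return {answer: round(100 * count / total, 1) for answer, count in counts.items()}


summary = summarize(raw_responses)
for answer, share in summary.items():
    print(f"{answer}: {share}%")
```

The list of raw answers carries no context on its own, while the percentage shares are in a form a reader can interpret directly.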

N-O-I-R data measurement scales

Quantitative research has four scales of measurement: nominal, ordinal, interval, and ratio (N-O-I-R). The letters NOIR present the measurement scales in ascending order. Nominal is the lowest level and is sometimes referred to as the 'naming' level. For example, in a study to determine salary scales in an organization, the group might be asked to state their marital status. In such a case, the group will respond with words describing their marital status ('married', 'single', 'divorced', etc.). Nominal responses have no inherent sequence, since there is no logical basis for classifying one category as superior to another.

Ordinal is the second level of measurement, in which subjects are arranged in order, for example from high to low. At the ordinal level, respondents are not represented by names but are given ranks such as 1, 2, 3, 4 or A, B, C, D. Continuing the example above, the researcher might rank the employees' salaries from the highest to the lowest paid. Rank 1 represents the person with the highest salary, rank 2 the second highest, and so on down to the last person. On the ordinal scale, the ranks say nothing about the size of the differences between subjects. For example, if John has rank 1 and Mercy has rank 2, the difference between their salaries is not known.

The third scale is called interval measurement. At the interval level, the distance between any two adjacent points on the scale is the same. For example, if the salary for rank 1 is $700 and for rank 2 is $650, then the interval is the same as that between rank 5 at $500 and rank 6 at $450. The interval scale also retains the property of rank ordering. At this level, the difference between any two ranks is known, since the $50 difference is the same regardless of the point on the scale. In other words, each unit on the interval scale represents the same amount of the aspect being measured, because all units are equal to one another.

The ratio scale is the highest level of measurement. It has the properties of rank order and equal distances, and in addition has an absolute zero point. Both the ratio scale and the interval scale have equal intervals between any two consecutive numbers. An example of a ratio scale is weight, because it has a measurable absolute zero point.
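One way to see the practical difference between the four scales is to look at which summary each one supports. The sketch below uses Python's standard `statistics` module. All sample values are invented for illustration, and temperature in degrees Celsius is used as a common textbook example of an interval scale; it is an assumption, not an example from the essay.

```python
import statistics

# Hypothetical sample values, invented for illustration only.
marital_status = ["married", "single", "married", "divorced"]  # nominal
salary_rank = [1, 2, 3, 4, 5]                                   # ordinal
temperature_c = [10.0, 20.0, 30.0]                              # interval
weight_kg = [40.0, 80.0]                                        # ratio

# Nominal: categories can only be counted, so the mode is the usable summary.
most_common_status = statistics.mode(marital_status)

# Ordinal: order is meaningful, so the median rank is usable.
median_rank = statistics.median(salary_rank)

# Interval: equal distances make differences and means meaningful.
mean_temperature = statistics.mean(temperature_c)
temperature_gap = temperature_c[1] - temperature_c[0]

# Ratio: an absolute zero makes "twice as heavy" a meaningful statement.
weight_ratio = weight_kg[1] / weight_kg[0]

print(most_common_status, median_rank, mean_temperature, temperature_gap, weight_ratio)
```

Each scale supports the summaries of the scales below it, which is why the NOIR order is described as ascending.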

T-test data analysis method

The t-test is used to test the difference between the means of two groups. Only two levels of the NOIR measurement scale are used (interval and ratio), since they give measurable differences between two variables. When using a t-test, the researcher aims to state, with a given degree of confidence, whether the observed difference between the means of the groups investigated is larger than chance alone would produce. The differences observed between variables either arise by chance or actually exist in the target group. The t-test suits small sample sizes, between 14 and 40, and assumes that the test population has a normal distribution and that the sample was drawn by simple random sampling. In addition, the t-test is used for independent samples, and each population should be at least 10 times larger than its sample.

In using a t-test, the researcher starts by formulating hypotheses about the difference between the variables. The null hypothesis (H0) states that the two means are equal, that is, the difference between them is zero. It is the default position, which is retained unless the data provide enough evidence against it. The second hypothesis is the alternative hypothesis, represented by H1. The alternative hypothesis states that a real difference exists. The following assumptions are made when using the t-test.

The data collected for the dependent (measured) variable are on a continuous scale, for example scores of 1, 2, 3, 4, 5,

The samples are drawn at random from the group of interest,

The data collected have a normal distribution, and

The standard deviations of the two groups studied are approximately equal.

The following steps are followed while conducting an analysis using the t-test method.

Step I: defining the null and alternative hypotheses,

Step II: calculating the t-statistic for the data, and

Step III: comparing the calculated t (tcalc) with the tabulated t-value (ttab). If tcalc > ttab, the null hypothesis is rejected and the alternative hypothesis is accepted. If tcalc ≤ ttab, the null hypothesis is not rejected.

In a t-test, the number of respondents is represented by n. The mean of the sample is given the symbol x̄.
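The three steps can be sketched in Python for two independent samples. This is a minimal illustration under the equal-standard-deviation assumption listed earlier, using the pooled-variance form of the t-statistic. The function names and the scores are hypothetical, and the tabulated value 1.86 is the one-tailed 95% entry for 8 degrees of freedom from a standard t-table.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for two independent samples.

    Assumes roughly equal standard deviations in the two groups,
    so the two sample variances are pooled.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


def reject_null(t_calc, t_tab):
    """Step III: reject H0 when the calculated t exceeds the tabulated t."""
    return t_calc > t_tab


# Step I: H0 says the two group means are equal; H1 says group A's mean is larger.
# Hypothetical scores, invented for illustration only.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

# Step II: calculate the t-statistic.
t_calc, df = pooled_t_statistic(group_a, group_b)

# Step III: compare with the tabulated value (one-tailed, 95%, 8 degrees of freedom).
t_tab = 1.86
decision = reject_null(t_calc, t_tab)
print(f"t_calc = {t_calc:.2f}, df = {df}, reject H0: {decision}")
```

The tabulated value depends on the degrees of freedom and the confidence level, so it has to be looked up again for each new data set.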

Example of a t-test analysis:

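The mean concentration measured for 7 solutions was m = 4 ppm, with a sample standard deviation of s = 0.9 ppm. Test, at the 95% confidence level, whether the mean concentration exceeds the allowed limit of 2 ppm.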

Solution:

STEP I:

Null hypothesis H0: μ = μ0

Alternative hypothesis H1: μ > μ0

μ0 = 2 ppm, which is the allowed limit.

μ is the population mean of the measured solutions.

STEP II:

Calculating the t-statistic using the formula:

$$t_{calc} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

$$t_{calc} = \frac{4 - 2}{0.9 / \sqrt{7}} = \frac{2}{0.9 / 2.65} \approx 5.88$$

where x̄ is the sample mean (m = 4 ppm), s is the sample standard deviation, and n is the number of measurements.

STEP III:

The t-table gives the tabulated values of t for a given significance level and number of degrees of freedom. From the table, with ν = n − 1 = 6 degrees of freedom at the 95% confidence level, ttab = 1.94. Since tcalc > ttab, we reject the null hypothesis and conclude that the mean concentration is larger than the allowed limit.
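The arithmetic of this worked example can be checked with a short Python sketch. The variable names are illustrative and not part of the original example; `t_tab` is simply the tabulated value quoted in the text.

```python
import math

n = 7            # number of solutions measured
sample_mean = 4  # ppm (m in the text)
s = 0.9          # ppm, sample standard deviation
mu_0 = 2         # ppm, allowed limit
t_tab = 1.94     # tabulated value quoted in the text (6 degrees of freedom, 95%)

# One-sample t-statistic: (sample mean - limit) / (s / sqrt(n))
t_calc = (sample_mean - mu_0) / (s / math.sqrt(n))
reject_h0 = t_calc > t_tab

print(f"t_calc = {t_calc:.2f}, reject H0: {reject_h0}")
```

Running the sketch gives a t-statistic of about 5.88, which is well above the tabulated value, so the null hypothesis is rejected.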

The use of correlation in a research with an example

Correlation shows the strength and direction of the relationship between a pair of variables in a study. Correlation can also dispel beliefs that things must behave in a certain manner because of their status. For example, there is a relationship between height and weight. Some people believe that a tall person must be heavier than a shorter person, but this is not always the case. Various techniques are used to determine correlation in research. The Pearson product-moment correlation measures the linear relationship between two variables. Correlation must work with quantitative data, since it uses numbers to describe how the variables relate. Categorical data, such as gender, are therefore not used with correlation. In terms of the NOIR scales, the data should be numeric and measured on a scale with equal intervals. The result is called the correlation coefficient (r), which ranges from -1.0 to +1.0. If r is close or equal to 0, the variables have no linear correlation. When r is positive, one variable gets larger as the other gets larger. When r is negative, one variable gets larger as the other gets smaller.
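For reference, the product-moment coefficient is commonly written as shown here. The notation is an assumption, since the essay does not define symbols for it: the x and y values with subscript i are the n paired observations, and the barred symbols are their means.

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

The numerator is positive when the two variables tend to lie on the same side of their means together, and negative when they tend to lie on opposite sides, which is what gives r its sign.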

Sample problem: find the correlation between work experience and income level for a group of 10 employees.

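The results are shown in Table 1 below.

| Employee no. | Experience in years | Income in $000 |
|---|---|---|
| 1 | 0 | 20 |
| 2 | 5 | 30 |
| 3 | 5 | 40 |
| 4 | 10 | 30 |
| 5 | 10 | 50 |
| 6 | 15 | 50 |
| 7 | 20 | 60 |
| 8 | 25 | 50 |
| 9 | 30 | 70 |
| 10 | 35 | 60 |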

Table 1: research results

Solution:

The results are plotted on a scatter graph, with income on the y axis and years of experience on the x axis. Figure 1 shows the resulting scatter graph.

Figure 1: relationship between income and years of experience

The straight line slopes upward, meaning there is a positive correlation: the more years of experience one has, the higher the income level.
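The graphical reading can be backed by a number. The sketch below computes the Pearson coefficient directly from the Table 1 values. The function name `pearson_r` is illustrative, and the calculation is written out by hand rather than taken from a statistics library.

```python
import math

# Data from Table 1: years of experience and income in $000.
experience = [0, 5, 5, 10, 10, 15, 20, 25, 30, 35]
income = [20, 30, 40, 30, 50, 50, 60, 50, 70, 60]


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(experience, income)
print(f"r = {r:.2f}")
```

For the ten employees in Table 1 this gives r of roughly 0.86, a strong positive correlation that agrees with the upward slope of the line.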