Anas Sudijono, Pengantar Statistik Pendidikan (Introduction to Educational Statistics). Individual assignment: review the statistical analyses used in an article, for the Statistics course of the Master's Program in Indonesian Language Education. Learning outcome: students are able to apply educational statistics in educational research and assessment (evaluation). Reference books / literature: a. Pengantar.

Author: | CLAUDETTE WICKSTROM |

Language: | English, Spanish, Dutch |

Country: | Sweden |

Genre: | Politics & Laws |

Pages: | 222 |

Published (Last): | 18.04.2016 |

ISBN: | 603-7-50234-909-6 |

ePub File Size: | 15.42 MB |

PDF File Size: | 12.12 MB |

Distribution: | Free* [*Registration Required] |

Downloads: | 45102 |

Uploaded by: | MARJORY |

Statistik pendidikan: buku bahan ajar mata kuliah statistik (Educational statistics: a textbook for the statistics course) / Bambang Budi Wiyono. Subject: educational statistics. Pengantar statistik pendidikan (Introduction to educational statistics) / Anas Sudijono.

Definition of educational statistics. Pustaka Pelajar.

Learning objectives and indicators:

Give examples of each level of the measurement scales.

Understand what a frequency distribution table is and how to construct one; arrange raw data into a frequency distribution table.

Understand graphs (such as the histogram) as a tool for depicting data; draw graphs from the data in a frequency distribution table.

Understand the mean, the median, and the mode and how each is calculated from data; calculate the mean, the median, and the mode.

Understand quartiles, deciles, and percentiles and how each is calculated from data; calculate quartiles.

Understand the deviation and how it is calculated; calculate the deviation and the standard deviation.

Understand correlational analysis techniques and the application of product-moment correlation and rank-order correlation; perform the calculations and analyze data with each technique.

Understand comparative analysis techniques.

Further indicators: calculate deciles and percentiles; understand the standard deviation and how it is calculated from data; perform the calculations and analyze the data for each analysis technique. References: Anas Sudijono; Anto Dajan.

Interval-scale variables allow us not only to rank order the items that are measured but also to quantify and compare the magnitudes of differences between them. Counts are interval scale measurements, such as counts of publications or citations, years of education, etc.

Ratio-scale Variables

These are continuous positive measurements on a nonlinear scale. A typical example is the growth of a bacterial population (say, with a growth function A·e^(Bt)). In this model, equal time intervals multiply the population by the same ratio. Hence the name ratio-scale. Ratio data are also interval data, but they are not measured on a linear scale.
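The "same ratio over equal time intervals" property can be checked numerically. The sketch below is only an illustration: the values chosen for A and B and the step length are assumptions, not taken from the text.

```python
import math

def population(t, a=100.0, b=0.5):
    """Exponential growth A * e^(B*t); a and b are illustrative values."""
    return a * math.exp(b * t)

STEP = 2  # length of each time interval (assumed)

# Ratio of the population after one step to the population before it,
# measured from three different starting times.
ratios = [population(t + STEP) / population(t) for t in (0, 3, 10)]

# Every ratio equals e^(B * STEP), regardless of the starting time.
print(ratios)
```

With these values each ratio is e^(0.5 × 2) = e, whichever starting time is used.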

With interval data, one can perform logical operations, add, and subtract, but one cannot multiply or divide. For instance, if a liquid is at 40 degrees and we add 10 degrees, it will be 50 degrees. In social sciences, the issue of "true zero" rarely arises, but one should be aware of the statistical issues involved.
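A short sketch of why differences are meaningful on an interval scale while ratios are not. It assumes the 40 and 50 degrees in the example are Celsius readings and uses the standard Celsius-to-Fahrenheit conversion; the variable names are illustrative.

```python
def c_to_f(celsius):
    """Standard Celsius-to-Fahrenheit conversion."""
    return celsius * 9 / 5 + 32

a_c, b_c = 40.0, 50.0

# Differences survive a change of unit: 10 Celsius degrees is always
# 18 Fahrenheit degrees, wherever on the scale it is measured.
diff_c = b_c - a_c
diff_f = c_to_f(b_c) - c_to_f(a_c)

# Ratios do not survive: the answer depends on where the arbitrary zero sits.
ratio_c = b_c / a_c                     # 1.25
ratio_f = c_to_f(b_c) / c_to_f(a_c)     # 122 / 104, about 1.17

print(diff_c, diff_f, ratio_c, ratio_f)
```

Because the two ratios disagree, a statement such as "50 degrees is 1.25 times as hot as 40 degrees" has no meaning independent of the unit chosen.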

However, this procedure should be avoided as it can distort the results. Attitude is the result of a number of external and internal factors. Depending upon the attitude to be measured, appropriate scales are designed. Scaling is a technique used for measuring qualitative responses of respondents, such as those related to their feelings, perceptions, likes, dislikes, interests and preferences.

Types of Scales

Most frequently used scales:
1. Nominal Scale
2. Ordinal Scale
3. Interval Scale
4. Ratio Scale

Self Rating Scales:
1. Graphic Rating Scale
2. Itemized Rating Scales
a. Likert Scale
b. Semantic Differential Scale
d. Multi Dimensional Scaling
e. Thurstone Scales

Nominal Scale

This is a very simple scale. These scales are just numerical labels and are the least restrictive of all the scales. Instances of nominal scales are credit card numbers, bank account numbers, employee id numbers, etc. The scale is simple and widely used when the relationship between two variables is to be studied. The following examples illustrate it: What is your gender? How do you stock items at present?

Ordinal Scale

Ordinal scales are the simplest attitude measuring scales used in Marketing Research.

It is more powerful than a nominal scale in that the numbers possess the property of rank order. Example 1: Rank the following attributes (1 - 5) on their importance in a microwave oven: a. Company Name, b. Price, d. Comfort, e. Design. The most important attribute is ranked 1 by the respondents and the least important is ranked 5. Instead of numbers, letters or symbols can also be used to rate on an ordinal scale. Such a scale makes no attempt to measure the degree of favourability of different rankings.
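Because ordinal ranks carry order but not distance, summaries based on position or frequency are the natural ones for them. The sketch below uses made-up ranks that seven hypothetical respondents gave to the attribute "Price"; the data are an assumption for illustration only.

```python
from statistics import median, mode

# Hypothetical ranks (1 = most important, 5 = least important)
# given to "Price" by seven respondents.
price_ranks = [1, 2, 1, 3, 1, 2, 5]

middle_rank = median(price_ranks)       # rank at the middle position -> 2
most_common_rank = mode(price_ranks)    # most frequently given rank -> 1

print(middle_rank, most_common_rank)
```

A mean of these ranks would treat the step from rank 1 to rank 2 as equal to the step from rank 4 to rank 5, which an ordinal scale does not guarantee.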

Median and mode are meaningful for ordinal scales.

Interval Scale

In an interval scale, the distances between the scale points are equal, unlike the categories of a nominal scale or the ranks of an ordinal scale. Interval scales are also termed rating scales. An interval scale has an arbitrary zero point, with further numbers placed at equal intervals. A very good example of an interval scale is a thermometer. Illustration 1 - How do you rate your present refrigerator for the following qualities?

Such a scale, however, does not permit the conclusion that position 4 is twice as strong as position 2, because no zero position has been established. The data obtained from the interval scale can be used to calculate the mean score of each attribute over all respondents.

The standard deviation (a measure of dispersion) can also be calculated. In the above example of an interval scale, a score of 4 on one quality does not necessarily mean that the respondent is twice as satisfied as a respondent who marks 2 on the scale.
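As a minimal sketch of the two statistics just mentioned, the following computes the mean and the sample standard deviation of one attribute's ratings. The five ratings are invented for illustration and the variable names are assumptions.

```python
from statistics import mean, stdev

# Hypothetical ratings (1-5) that five respondents gave to one
# quality of their refrigerator.
ratings = [1, 3, 3, 3, 5]

average_rating = mean(ratings)   # sum of ratings / number of respondents
spread = stdev(ratings)          # sample standard deviation (n - 1 in the denominator)

print(average_rating, spread)
```

Here the mean is 3 and the sample standard deviation is the square root of 2, since the squared deviations sum to 8 over n - 1 = 4.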

A ratio scale has a natural zero point, and further numbers are placed at equally appearing intervals. Examples are scales for measuring physical quantities such as length and weight. Ratio scales are very common in physical scenarios. Quantified responses forming a ratio scale are analytically the most versatile. A ratio scale possesses all the characteristics of an interval scale, and the ratios of the numbers on these scales have meaningful interpretations. Data on certain demographic or descriptive attributes, if they are obtained through open-ended questions, will have ratio-scale properties.

Consider the following question: Q1. What is your annual income before taxes? Since the starting point is not chosen arbitrarily, computing and interpreting ratios makes sense.

Self rating scales

1. Graphic Rating Scale

The respondents rate the objects by placing a mark at the appropriate position on a line that runs from one extreme of the criterion variable to another.

Example: a line for BRAND 1 marked at 0, 1, 5 and 7, labelled from poor quality through bad quality and neither good nor bad to good quality. This is also known as a continuous rating scale. The customer can occupy any position. Here one attribute is taken, for example the quality of a brand of ice cream. There is no other indication on the continuous scale; only a range is provided. Its limitation is that coding and analysis require a substantial amount of time, since we first have to measure the physical distances on the scale for each respondent.

2. Itemized Rating Scales

These scales are different from continuous rating scales. They have a number of brief descriptions associated with each category. They are widely used in Marketing Research. They essentially take the form of multiple category questions. Others are the Thurstone and Guttman scales.

Likert Scale

It was developed by Rensis Likert. Here the respondents are asked to indicate a degree of agreement or disagreement with each of a series of statements.

Each scale item has 5 response categories ranging from strongly agree to strongly disagree. Each degree of agreement is given a numerical score, and the respondent's total score is computed by summing these scores.

This total score reveals the particular opinion of the respondent. Likert scales are of the ordinal type: they enable one to rank attitudes, but not to measure the difference between attitudes. They take about the same amount of effort to create as a Thurstone scale and are considered more discriminating and reliable because of the larger range of responses typically given in a Likert scale.
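The summation described above can be sketched as follows. The text only names the two end categories, so the three middle labels and the 1-to-5 numbering are assumptions, as are the function and variable names.

```python
# Assumed numeric scores for the five response categories; only the two
# extremes are named in the text, the middle labels are illustrative.
SCORES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def total_score(responses):
    """Sum the numeric scores of one respondent's answers."""
    return sum(SCORES[answer] for answer in responses)

# One hypothetical respondent answering three statements.
respondent = ["agree", "strongly agree", "neutral"]
print(total_score(respondent))  # 4 + 5 + 3 = 12
```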

A typical Likert scale has 20 - 30 statements. While designing a good Likert scale, first a large pool of statements relevant to the measurement of the attitude has to be generated, and then the statements which are vague and non-discriminating have to be eliminated from the pool.

No judging gap is involved in this method.

Semantic Differential Scale

This is a seven point scale, and the end points of the scale are associated with bipolar labels.

Individuals can score each item from 1 to 7 or from -3 to 3. On the basis of these responses, profiles are made. We can analyse two or three products, and by joining these profiles we get a profile analysis. It could take any shape depending on the number of variables. This scale helps to determine overall similarities and differences among objects. When the Semantic Differential Scale is used to develop an image profile, it provides a good basis for comparing images of two or more items.

The big advantage of this scale is its simplicity, while producing results comparable with those of the more complex scaling methods. The method is easy and fast to administer, but it is also sensitive to small differences in attitude, highly versatile, reliable and generally valid. This scale has some distinctive features: Each item has ten response categories. Each item has an even number of categories. The response categories have numerical labels but no verbal labels.

Select a plus number for words which describe the ice cream accurately. Select a minus number for words you think do not describe the ice cream quality accurately. This scale is usually presented vertically.

Multi Dimensional Scaling

It consists of a group of analytical techniques which are used to study consumer attitudes related to perceptions and preferences.

It is used to study:
a. The major attributes of a given class of products perceived by the consumers in considering the product and by which they compare the different brands.
b. Which brands compete most directly with each other.
c. Whether the consumers would like a new brand with a combination of characteristics not found in the market.
d. What the consumers' ideal combination of product attributes would be.
e. What sales and advertising messages are compatible with consumers' brand perceptions.

It is a computer based technique. The respondents are asked to place the various brands into different groups like similar, very similar, not similar, and so on.

A goodness of fit is traded off on a large number of attributes.

Then a lack of fit index is calculated by a computer program. The purpose is to find a reasonably small number of dimensions which will eliminate most of the stress. This scaling involves an unrealistic assumption that a consumer who compares different brands would perceive the differences on the basis of only one attribute.

For example, what are the attributes for joining an M.Com course? There are a number of attributes; you cannot base the decision on one attribute only. Therefore, when consumers are choosing between brands, they base their decision on various attributes. In practice, the perceptions of the consumers involve different attributes, and any one consumer perceives each brand as a composite of a number of different attributes. This is a shortcoming of this scale. Whenever we choose from a number of alternatives, go for multi-dimensional scaling.

There are many possible uses of such scaling, for example in market segmentation, product life cycle, vendor evaluations and advertising media selection. The limitation of this scale is that it is difficult to clearly define the concept of similarities and preferences. Further, the distances between the items are seen as different.

e. Thurstone Scales

These are also known as equal appearing interval scales.

They are used to measure the attitude towards a given concept or construct. For this purpose a large number of statements are collected that relate to the concept or construct being measured.

The judges rate these statements along an 11 category scale in which each category expresses a different degree of favourableness towards the concept.

The items are then ranked according to the mean or median ratings assigned by the judges and are used to construct questionnaire of twenty to thirty items that are chosen more or less evenly across the range of ratings. The statements are worded in such a way so that a person can agree or disagree with them.

The scale is then administered to a sample of respondents, whose scores are determined by computing the mean or median value of the items agreed with. A person who disagrees with all the items has a score of zero.
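The scoring rule just described can be sketched in a few lines. The item identifiers and their scale values (the judges' ratings on the 11-category scale) are invented for illustration, and the function name is an assumption.

```python
from statistics import mean, median

# Hypothetical scale values assigned to four statements by the judges.
scale_values = {"s1": 2.0, "s2": 5.5, "s3": 8.0, "s4": 10.5}

def thurstone_score(agreed_items, use_median=False):
    """Score a respondent from the scale values of the items agreed with.

    A respondent who agrees with no item scores zero, as stated in the text.
    """
    if not agreed_items:
        return 0.0
    values = [scale_values[item] for item in agreed_items]
    return median(values) if use_median else mean(values)

print(thurstone_score(["s2", "s3"]))                         # (5.5 + 8.0) / 2 = 6.75
print(thurstone_score(["s1", "s3", "s4"], use_median=True))  # middle value = 8.0
print(thurstone_score([]))                                   # 0.0
```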

But it is a time-consuming and labour-intensive method.

They are commonly used in psychology and education research. For example - Children should not be allowed to watch indecent programmes or government should ban these programmes or they are not allowed to air on the television. They all are related to one aspect. In this scale each score represents a unique set of responses and therefore the total score of every individual is obtained. This scale takes a lot of time and effort in development. They are very commonly used in political science, anthropology, public opinion, research and psychology.

The Q Sort technique It is used to discriminate among large number of objects quickly. It uses a rank order procedure and the objects are sorted into piles based on similarity with respect to some criteria. The number of objects to be sorted should be between approximately.

For example, here we are taking nine brands. On the basis of taste we classify the brands into tasty, moderate and non tasty.

We can classify on the basis of price also-Low, medium, high. Then we can attain the perception of people that whether they prefer low priced brand, high or moderate. We can classify sixty brands or pile it into three piles.

So the number of objects is to be placed in three piles-low, medium or high. Thus, the Q-sort technique is an attempt to classify subjects in terms of their similarity to attribute under study. The tutorial begins with a presentation of the properties of the abstract number system that are used to identify different scales of measurement.

It is followed by examples drawn from everyday life and psychological research. We recommend that you begin with everyday examples, as they may be more familiar to you, and then proceed to examples drawn from psychological research.

Once you understand the different scales of measurement and their properties, you should test your knowledge by doing the practice exercises.

The tutorial ends with a summary of all of the scales of measurement, their properties, and the descriptive and inferential statistics that may be used with each. Why Is This Important? The mathematical properties of the numbers you are going to analyze are important because they determine which mathematical operations are allowed.

This, in turn, determines which statistics you can use with those numbers.

Properties of the Abstract Number System

The process of measurement involves assigning numbers to observations according to rules. The way that the numbers are assigned determines the scale of measurement. Four scales of measurement are typically discussed in psychological statistics: nominal, ordinal, interval, and ratio. Now let's try some examples.

Select either examples of levels of measurement you may encounter in everyday life or those typically found in psychological research. Properties of the Abstract Number System - Everyday Examples We can describe the scales of measurement used in everyday examples in terms of their abstract number properties see main page.

Presented below are a series of everyday examples of each of the four primary scales of measurement. Once you have progressed through each of these examples, you will explore examples drawn from psychological research. Numbers are assigned to categories as "names". Which number is assigned to which category is completely arbitrary. Therefore, the only number property of the nominal scale of measurement is identity.

The number gives us the identity of the category assigned. The only mathematical operation we can perform with nominal data is to count. Classifying people according to gender is a common application of a nominal scale. In the example below, the number "1" is assigned to "male" and the number "2" is assigned to "female". We can just as easily assign the number "1" to "female" and "2" to male.

The purpose of the number is merely to name the characteristic or give it "identity". As we can see from the graphs, changing the number assigned to "male" and "female" does not have any impact on the data -- we still have the same number of men and women in the data set.
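The point that the numeric codes are arbitrary can be checked directly: counting cases under either coding gives the same distribution. The small data set below is made up, and the function name is an assumption.

```python
from collections import Counter

# Hypothetical data set of five people.
genders = ["male", "female", "female", "male", "female"]

coding_a = {"male": 1, "female": 2}
coding_b = {"female": 1, "male": 2}   # the same categories, codes swapped

def counts_by_label(data, coding):
    """Encode the data with the given coding, count codes, map back to labels."""
    code_counts = Counter(coding[value] for value in data)
    label_of = {code: label for label, code in coding.items()}
    return {label_of[code]: n for code, n in code_counts.items()}

print(counts_by_label(genders, coding_a))
print(counts_by_label(genders, coding_b))
```

Both calls report two men and three women; only the numeric label attached to each category differs.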

Additional examples of everyday nominal scales are zip codes and area of country.

Ordinal

Ordinal scales have the property of magnitude as well as identity. The numbers represent a quality being measured (identity) and can tell us whether a case has more or less of the quality measured than another case (magnitude).

The distance between scale points is not equal. Ranked preferences are presented as an example of ordinal scales encountered in everyday life. We also address the concept of unequal distance between scale points.

Ranked Preferences

We are often interested in preferences for different tastes, especially if we are planning a party.

Let's say that we asked the three students pictured below to rank their preferences for four different sodas. We usually rank our strongest preference as "1". With four sodas, our lowest preference would be "4".

For each soda, we assign a rank that tells us the order (magnitude) of the preference for that particular soda (identity). The number simply tells us that we prefer one soda over another, not "how much" more we prefer the soda. Because of the property of magnitude (or order), the numbers are no longer considered arbitrary as they are in nominal scales.

If you asked students their preferences because you wanted to serve what they like best at a party, you would serve our first student Pepsi, our second student Sprite, and our third student Surge. Let's change the numbers assigned to "Pepsi" and "Coke" for our first student.

Changing the numbers changes the meaning of the preferences. You would now serve our first student Coke and not Pepsi. Distance between Scale Points We assume that the intervals between scale points on ordinal scales are unequal.

Thus, the "distance" between a rank of "1" and "2" is not necessarily the same as the "distance" between ranks of "3" and "4". She thinks Sprite is OK but prefers cola drinks.

She really does not like Surge at all. In this case the preference "distance" between "3" and "4" is much greater than the preference "distance" between ranks "1" and "2" even though the numerical distance between them is the same. This concept of unequal psychological distance is pictured below. Other examples of everyday ordinal scales: socioeconomic status, class rank, letter grade.

Interval

Interval scales have the properties of identity, magnitude, and equal distance between scale points. So, we can always be confident that the meaning of the distance between 25 and 35 is the same as the distance between 65 and 75. A good example of an interval scale is the measurement of temperature on Fahrenheit or Celsius scales.

The units on a thermometer represent equal volumes of mercury between each interval on the scale. The thermometer identifies for us how many units of mercury correspond to the temperature measured. Zero degrees on either scale is an arbitrary number and not a "true" zero. The zero point does not indicate an absence of temperature; it is an arbitrary point on the scale.

Other examples of everyday interval scales: age (0 is culturally determined), SAT scores.

Ratio

Scales with an absolute zero and equal interval are considered ratio scales. Money is a good example of an everyday ratio scale of measurement. If we have no money in our pockets, we have absolutely no ability to purchase anything.

Other examples of everyday ratio scales: Household size, annual income.

Properties of the Abstract Number System Psychological Examples We can describe the scales of measurement used in psychological research in terms of their abstract number properties.

The measures that we use in psychology vary in how they assign numbers to psychological characteristics. All assign numbers, but the properties of the abstract number system that are represented by those numbers can vary widely.

The following pages have a series of examples drawn from psychological research that illustrate each of the four primary scales of measurement. Nominal Nominal scales are the lowest scales of measurement.

A typical example of a nominal variable in psychology is diagnosis. These numbers are merely arbitrary "codes" for easy record keeping for hospitals and insurance companies.

Let's look at three diagnoses -- Schizophrenia - Disorganized Type The graph below shows the prevalence of these disorders in a hospital population. Suppose the hospital administrator decided to change the numbers assigned to each diagnosis.

Let's see what happens to the distribution. Only the label changes. Other examples for other psychological research nominal scales: Behavior codes in naturalistic observations, drug type, brain regions. We will discuss three important concepts in this section. Ranked Preferences Physiological psychologists are often interested in preferences for different tastes. Let's say that you were asked to taste five different foods and rank your preference in order.

The foods are sweet, salty, bitter, sour, and fatty. With five foods, our lowest preference would be "5". These ranks have the property of identity because they tell us which taste is being ranked, and the property of magnitude because they tell us whether it is preferred more or less than another. They do not tell us "how much" more, just more or less. Because of the property of magnitude (or order), the numbers are no longer considered arbitrary.

The numbers reflect a characteristic of the person -- taste preference. Let's change the numbers assigned to "Sweet" and "Fatty" and "Sour" for our first student. Changing the numbers changes the meaning -- the numbers now indicate a very different type of person -- rather than having a "sweet tooth" we would think of this student as having an unusual taste preference. Assigned Ranks Psychologists are often asked to rank the performance of group members.

For example, a graduate program has 10 applicants who qualify for a scholarship but can only fund 3 applicants. Faculty members are asked to rank the students using a set of criteria in order to select the scholarship winners from the applicant pool. Many colleges use class rank as an admissions criterion. Assigned ranks are classified as ordinal scales because they have the properties of identity and magnitude.

We know that one person has been identified as having more skill than another but we do not know exactly how much more skill. The student with the 3. Let's see what happens if we switch the numbers around. With this change, the number "6" no longer has the same meaning; we no longer know the relative class standing of any student in this group. The order of numbers must be constant in either ascending or descending order.

Distance Between Scale Points We assume that the intervals between scale points on ordinal scales are unequal. Let's say that for the ranked preferences, our second student liked fatty tastes the best but also has a strong liking for salty foods which he rated as "2". He likes sweet foods rated "3" but has a stronger preference for foods such as French fries and potato chips that are both fatty and salty.

He does not like sour foods rated "4" and really hates bitter tastes rated "5". In this case the preference "distance" between "3" and "4" is much greater than the preference "distance" between ranks "1" and "2". The psychological distance represented by the interval between numbers is not equal.

Even though there is a one-point distance between ranks 2 and 3 and ranks 5 and 6, the differences in GPAs are not equal. Thus, we know that student 2 is one rank higher than student 3 and student 5 is one rank higher than student 6, but the rank does not tell us how much higher they are in their GPA.

Interval

So we can always be confident that the meaning of the distance between 7 and 10 is the same as the distance between 42 and 45. Interval scales do not have a true zero point; the number "0" is arbitrary.

Many of our standardized tests in psychology use interval scales. An IQ Intelligence Quotient score from a standardized test of intelligence is a good example of an interval scale score. IQ scores are derived from a lengthy testing process that requires the participant to complete a number of cognitive tasks. Each task is scored and the set of scores is converted into an overall standardized IQ score.

IQ scores are created so that a score of 100 represents the average IQ of the population, with a fixed standard deviation (or average variability) of scores. A distribution of IQ scores is presented below. If one student receives an IQ score of 84 and another student receives an IQ score of 116, we can count on the units having the same meaning in order to compare them. The first student would be 16 points below the mean, which would indicate a below-average potential for educational pursuits.

The second student would be 16 points above the mean which would indicate an above-average potential for educational activities. There is no zero point for IQ. We do not think of a person as having no intelligence although we may be tempted to make that evaluation upon occasion. Similarly for standardized scales of personality or other psychological attributes -- a zero point is an arbitrary point on a scale and does not indicate the absence of a quality or characteristic.

You may need to read a test manual or detailed description of scoring procedures to determine whether a standardized test is measured on an interval scale. The interval scale of measurement only permits mathematical operations of addition and subtraction. We can combine amounts or remove amounts. We can discuss an amount that is more than, less than, or equal to another amount. But we cannot make statements that involve multiplication or division. When measuring IQ, we can say that a person with an IQ score of 110 is 40 points higher than someone with an IQ score of 70, but we would never say that one IQ score means that someone is twice as intelligent as someone with half that score. A true zero point is required to make valid statements about mathematical operations of multiplication or division of numbers on a scale.

Ratio

Ratio scales are often applied in psychological research when we are measuring physical characteristics or dimensions such as weight and time. Because ratio scales permit all possible mathematical operations, they are sometimes preferred by scientists.

Scales with an absolute zero and equal interval are considered ratio scales of measurement. Now, let's compare this count of the number of behaviors to our interval scale. We would never say that someone had no IQ. Yet, we can confidently discuss how many more times a particular behavior occurred. This is the advantage of ratio scales. It is the true zero point in ratio scales that allows us to multiply and divide.

Other examples of ratio scales in psychological research: Height, weight, volume, latency. Likert-type Ratings Likert-type ratings are used in surveys where we are asked to rate how much we agree or disagree with a statement.

Some psychologists classify Likert-type ratings as ordinal scales; others consider them to be roughly interval or approximately equal interval. Likert-type ratings have the properties of identity and order. They have the property of identity because they let us know whether we agree or disagree. They also have the property of order because each number represents a rating that is more or less than the others.

Psychologists disagree as to whether the interval between scale points is equal or not equal. Listed below is an item from a survey about relationships. We are asked to rate how much the item describes how we feel in our important relationships. It is easy for me to become emotionally close to others. I am comfortable depending on them and having them depend on me.

I don't worry about being alone or having others not accept me. We would consider this a high ranking or "strong" agreement with the statement. On this scale, the number "6" no longer has the meaning of strongly agreeing with the statement.

In fact, it does not have a clear meaning at all. Psychologists who classify Likert-type ratings as ordinal consider the distance between scale points to be unequal. Research suggests that our willingness to endorse positive statements is different from our willingness to endorse negative statements. We are also less likely to endorse statements at the extreme ends of the scale.

Thus, the psychological distance represented by the interval between numbers is not equal. Psychologists who classify Likert-type ratings as approximately interval argue that when scales are carefully constructed to have a sufficient number of scale points and appropriate labels, we can assume that the psychological distance between scale points is equal. But because we can never be absolutely certain that the scale points represent equal psychological distance, we call the scales roughly or approximately interval to indicate that they are treated as such.

So, do Likert-type ratings have an ordinal or approximately interval scale of measurement? This is up to you, as a researcher and data analyst, to decide. Many psychologists create measures by adding up the individual Likert-type ratings or calculating an average rating across items. As some psychologists consider Likert ratings to be ordinal and others classify them as approximately interval, what is the scale of measurement of scores created from a set of individual Likert-type ratings?

Most psychologists consider such scores to be roughly or approximately equal interval.

The term "approximately interval" denotes that the scales are not interval but are treated as such in data analysis. Suppose that a self-esteem scale asks people to rate how much they agree or disagree with a series of items that ask about positive or negative personal qualities. Scoring instructions state that responses should be averaged across all items. The Likert-type ratings for each item can have a score of 1, 2, 3, or 4.

A distribution of scores is presented below. Once the items are averaged, the overall mean self-esteem score can have values such as 1. The distribution of the average across items is presented below. Creating an average across multiple items that measure the same underlying construct (self-esteem) is thought to increase the reliability and validity of the scale.

The numeric average of these items gives us a total score with a much wider range of possible values than just 1 to 4. Similarly, a sum or total of item responses will give a broader range of scores than the individual Likert-type ratings.
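The widening of the score range can be illustrated with a minimal Python sketch. The four-item scale and the function name `possible_means` are assumptions chosen for illustration; the source does not state how many items the self-esteem scale has.

```python
from itertools import product


def possible_means(n_items, points=(1, 2, 3, 4)):
    """Return the sorted distinct mean scores obtainable from n_items ratings.

    n_items and points are illustrative assumptions, not values from the tutorial.
    """
    # Enumerate every combination of ratings and keep the distinct averages.
    return sorted({sum(combo) / n_items for combo in product(points, repeat=n_items)})


single_item = possible_means(1)   # one Likert-type rating on its own
four_items = possible_means(4)    # hypothetical 4-item scale, averaged

print(len(single_item), "distinct values for a single rating")
print(len(four_items), "distinct values for the 4-item average")
print(four_items)
```

A single rating can take only four values, while the averaged score takes many intermediate values between 1 and 4, which is the property that leads many analysts to treat such composite scores as approximately interval.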

For this reason, many psychologists and statisticians treat scales created in this fashion as interval scales even if they view the individual Likert-type ratings as ordinal scales.

Summary

Why does the scale of measurement matter? The scale of measurement of our variables determines the mathematical operations that are permitted for those variables. In turn, these mathematical operations determine which statistics can be applied to the data.

The chart below lists the scales of measurement that we have reviewed in this exercise and the types of statistics that can be applied to variables created using these scales of measurement. The tutorial begins with a set of questions you should ask when selecting your test.

It is followed by demonstrations of the factors that are important to consider when choosing your statistic. Once you understand how to choose among statistics, you should test your knowledge by doing the practice exercises. This tutorial assumes that you know all of the basic univariate statistics. The tutorial is not designed to teach you the formulas and procedures for conducting these tests. Rather, it focuses on how to select the appropriate statistic to test a particular hypothesis.

Why This Is Important?

Now that you know all of the basic univariate statistical tests, it can be difficult to figure out which one you need to use for your research design. This workshop will give you a broad overview of the different statistics available to help you choose which one is best for your study. Presented below are four questions you should ask and answer when trying to determine which statistical procedure is most appropriate to test your hypothesis.

To determine which test should be used in any given circumstance, we need to consider the hypothesis that is being tested, the independent and dependent variables and their scale of measurement, the study design, and the assumptions of the test.

Independent and Dependent Variables

Our dependent variable is always the phenomenon or behavior that we want to explain or predict. The independent variable represents a predictor or causal variable in the study.

In any antecedent-consequent relationship, the antecedent is the independent variable and the consequent is the dependent variable.

Study

It has been traditional for the man rather than the woman to receive the check when a couple dines out. A researcher wondered whether this would be true if the woman was clearly in charge, asking for the wine list, questioning the waiter about dishes on the menu, etc. A large random sample of restaurants was selected. One couple was used in all restaurants, but in half the man assumed the traditional in-charge role, and in the other half the woman was in charge.

At each restaurant, the couple recorded whether the check was presented to the man or to the woman. Test the research hypothesis that the check will be presented to the person showing in-charge behavior.

Analysis

The behavior that we are trying to explain is the presentation of the check. Did the wait staff give the check to the man or the woman? This would be the dependent variable in the study. The independent variable was manipulated as part of the experimental design.

The independent variable was who was in charge during the dinner. Now that we have identified the independent and dependent variables, our next step is to determine the scale of measurement.

Scale of Measurement

Once we have identified the independent and dependent variables, our next step in choosing a statistical test is to identify the scale of measurement of the variables.

All of the parametric tests that we have learned to date require an interval or ratio scale of measurement for the dependent variable. Many psychologists also apply parametric tests to variables with an approximately interval scale of measurement. It is your decision whether to consider such a variable approximately interval. If you are working with a dependent variable that has a nominal or ordinal scale of measurement, then you must choose a nonparametric statistic to test your hypothesis.

The scale of measurement of the dependent variable helps us to choose the broad category of statistical procedures appropriate for our hypothesis (nonparametric vs. parametric).

The scale of measurement of the independent variable helps us to determine which statistical procedure within the broad category is appropriate. Nonparametric statistics are used when our data are measured on a nominal or ordinal scale of measurement. Chi-square statistics and their modifications are used when data are measured on a nominal scale of measurement. All other nonparametric statistics are appropriate when data are measured on an ordinal scale of measurement. The nonparametric statistics that we learned in class are listed below.

In order to choose among these tests, we must next determine the number of samples in the study design and the relationship between samples.

Parametric statistics are used when our data are measured on an interval, ratio, or approximately interval scale of measurement. No distinctions are made among these scales of measurement, although all parametric statistics have assumptions that must be met before proceeding with statistical analysis. The parametric statistics that we learned in class are listed below. This is not a complete list -- we did not learn advanced statistical techniques. Once we have reviewed our options, the next step is to consider the study design.

Let us now return to our study example and determine the scale of measurement in our study.

Designs for which one-sample tests are appropriate collect only one set of scores. There must be at least two sets of scores or two "samples" for any statistic that examines differences between groups.

One-Sample Tests

It is sometimes difficult to determine from a study description how many samples of data were collected.

Typically when the description refers to one type of person from a larger population, the study design uses only one sample. An example would be if we wanted to know whether Emory students are like college students in general. Emory students are a specific sample from the larger population of college students. The specific test that we use now varies based on whether we have collected data on only one dependent variable or on both independent and dependent variables.

Dependent Variable

With single samples and one dependent variable, the one-sample Z test, the one-sample t test, and the chi-square goodness-of-fit test are the only statistics that can be used. Students sometimes ask, "But don't you have population data too, so you have two sets of data?" Data have to exist, or else the population parameters could not be defined. But the researcher does not collect these data; they already exist. For the chi-square goodness-of-fit test, you can also compare the sample against chance probabilities.
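As a rough illustration of comparing one sample against chance probabilities, the following Python sketch computes the chi-square goodness-of-fit statistic. The observed counts, the three equally likely categories, and the function name are hypothetical and are not taken from the tutorial.

```python
def chi_square_goodness_of_fit(observed, expected_probs):
    """Return the chi-square statistic for one sample of category counts.

    observed:       counts per category (hypothetical data in the example below)
    expected_probs: probability of each category under the null hypothesis
    """
    total = sum(observed)
    # Expected count in each category if the null hypothesis were true.
    expected = [total * p for p in expected_probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: 60 people choose among three options.
observed_counts = [30, 20, 10]
chance = [1 / 3, 1 / 3, 1 / 3]   # chance probabilities, assumed equal

statistic = chi_square_goodness_of_fit(observed_counts, chance)
degrees_of_freedom = len(observed_counts) - 1
print(statistic, degrees_of_freedom)
```

The statistic would then be compared with the critical chi-square value for the stated degrees of freedom. Only one sample is collected; the expected frequencies come from chance or from population values that already exist.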

The statistics that we have learned to test hypotheses about association include: Studies that refer to specific subgroups in the population also collect two or more samples of data. Once you have determined that the design uses two or more samples or "groups", then you must determine how many samples or groups are in the design.

Studies that are limited to two groups use either the chi-square statistic, Mann-Whitney U, Wilcoxon T, independent means t test, or the dependent means t test.

Dependent Means

Dependent groups refer to some type of association or link in the research design between sets of scores. This usually occurs in one of three conditions -- repeated measures, linked selection, or matching.

Repeated measures designs collect data on subjects using the same measure on at least two occasions. This often occurs before and after a treatment or when the same research subjects are exposed to two different experimental conditions. When subjects are selected into the study because of natural "links or associations", we want to analyze the data together.

This would occur in studies of parent-infant interaction, romantic partners, siblings, or best friends. In a study of parents and their children, I would want my data to be associated with my son's, not some other child's.

Subject matching also produces dependent data. Suppose that an investigator wanted to control for socioeconomic differences in research subjects. She might measure socioeconomic status and then match on that variable.

The scores on the dependent variable would then be treated as a pair in the statistical test. All statistical procedures for dependent or correlated groups treat the data as linked; therefore, it is very important that you correctly identify dependent groups designs.

Independent Means

When there is no subject overlap across groups, we define the groups as independent.

Tests of gender differences are a good example of independent groups. We cannot be both male and female at the same time; the groups are completely independent. If you want to determine whether samples are independent or not, ask yourself, "Can a person be in one group at the same time he or she is in another?" If not, the groups are independent.

At each restaurant, the couple recorded whether the check was presented by wait staff to the man or to the woman.

Analysis

In this study, the sample consists of the wait staff at randomly selected restaurants.

At each restaurant, either the man or the woman assumed the in-charge role. Thus, experimental procedures resulted in two separate samples of subjects -- wait staff serving an "in-charge" man and wait staff serving an "in-charge" woman. Given that a couple was dining together at each restaurant, these are independent groups; one subject could not be assigned to both the "in-charge man" and the "in-charge woman" experimental groups.

Thus, the statistic selected must be one for independent groups. The scale of measurement of the dependent variable in this study is nominal (check presented to man or woman); thus, the appropriate statistic would be the chi-square test of independence.
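The selection steps for two-group designs can be summarized in a small Python sketch. The mapping below is an assumption based on the conventional use of the five two-group tests named in this tutorial; the function name and its arguments are hypothetical, and designs outside that list (for example, nominal data from dependent groups) are deliberately rejected rather than guessed at.

```python
def choose_two_group_test(dv_scale, groups_dependent):
    """Suggest a two-group test from the dependent variable's scale and design.

    dv_scale:         "nominal", "ordinal", "interval", "ratio",
                      or "approximately interval"
    groups_dependent: True for repeated measures, linked selection, or matching
    """
    scale = dv_scale.lower()
    if scale == "nominal":
        if groups_dependent:
            # Not covered by the tests listed in this tutorial.
            raise ValueError("no listed test for nominal data with dependent groups")
        return "chi-square test of independence"
    if scale == "ordinal":
        return "Wilcoxon T" if groups_dependent else "Mann-Whitney U"
    if scale in ("interval", "ratio", "approximately interval"):
        return "dependent means t test" if groups_dependent else "independent means t test"
    raise ValueError(f"unknown scale of measurement: {dv_scale!r}")


# Restaurant study: nominal dependent variable, independent groups.
print(choose_two_group_test("nominal", groups_dependent=False))
```

This sketch covers only the first three questions (variables, scale of measurement, design). Checking the assumptions of the chosen test remains a separate step.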

This statistic allows us to determine whether the frequency of check presentation to the male or female diner varied by whether the male or female was assuming the in-charge role. Our last step in selecting the appropriate statistic for analysis is to determine whether the assumptions of the statistical procedure have been met.

Assumptions

The final factor that we need to consider is the set of assumptions of the test.

All parametric tests assume that the populations from which samples are drawn have specific characteristics and that samples are drawn under certain conditions.

These characteristics and conditions are expressed in the assumptions of the tests. Nonparametric tests make assumptions about sampling rather than about the distribution of scores in the population.

Parametric Assumptions

Listed below are the most frequently encountered assumptions for parametric tests. Statistical procedures are available for testing these assumptions. The Kolmogorov-Smirnov Test is used to determine how likely it is that a sample came from a population that is normally distributed.

The Levene test is used to test the assumption of equal variances. If we violate test assumptions, the statistic chosen cannot be applied. In this circumstance we have two options:

1. We can use a data transformation.
2. We can choose a nonparametric statistic.

If data transformations are selected, the transformation must correct the violated assumption.

If successful, the transformation is applied and the parametric statistic is used for data analysis.

Analysis

We selected the chi-square test of independence for data analyses because the dependent variable is measured on a nominal scale of measurement and we have two independent groups in our design (in-charge man and in-charge woman). The chi-square test of independence is a nonparametric test, so we make no distributional assumptions about check presentation in the population.

The chi-square test of independence does require random sampling and independence of observation. Our study meets both of these assumptions, so we can proceed to data analysis.
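To make the data analysis step concrete, here is a minimal Python sketch of the chi-square test of independence for the restaurant design. The counts in the table are invented for illustration only; the tutorial reports no data. No continuity correction is applied, and the p-value shortcut used here holds only when there is one degree of freedom.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# HYPOTHETICAL counts. Rows: who was in charge (man, woman).
# Columns: who received the check (man, woman).
counts = [
    [40, 10],
    [15, 35],
]

statistic, df = chi_square_independence(counts)

# With df == 1, chi-square is the square of a standard normal variable,
# so the upper-tail probability can be written with the error function.
p_value = math.erfc(math.sqrt(statistic / 2)) if df == 1 else None
print(round(statistic, 2), df, p_value)
```

A small p-value would indicate that who receives the check depends on who shows in-charge behavior, which is the research hypothesis of the study.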

We have reviewed the steps needed to correctly select the statistic needed to test our hypothesis. Proceed to the Practice Exercises to test your knowledge.

Review of Tests

We have learned a broad range of univariate inferential statistics.