DATA COLLECTION TECHNIQUES

GROUP MEMBERS:-

GYAN PRAKASH

 

Ram POOJA YADAV

WHAT IS DATA?

The term "data" is of Latin origin, derived from the verb meaning "to give"; it means "those that are given". It is a plural word, and "datum" is its singular form. Data are the values of qualitative and quantitative variables.

Some terms associated with data that we need to know: 1. Data point 2. Data set 3. Variable 4. Observation

Data Set

:- A collection of data is known as a data set.

Data Point

:- A single observation is known as a data point.

Variable

:- It is a quantity whose value varies.

Observation

:- It is the value assigned to a variable.

Types of variable

1. Quantitative

2. Qualitative

3. Dependent

4. Independent

5. Discrete

6. Continuous

Quantitative Variables: These are the ones that can take only numerical values.

Qualitative Variables: These are the ones that do not take numerical values, but depend upon quality.

Independent Variable: It is the one whose effect the experimenter is interested in. It is also known as the stimulus variable.

Dependent Variable: It is the one that varies according to the variation in the independent variable. For example, in y = a + bx, y is the dependent variable and x is the independent variable.
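The relation y = a + bx can be illustrated with a minimal sketch. The function name `dependent_value` and the values chosen for a, b and x are assumptions made only for illustration; they do not come from the slides.

```python
def dependent_value(x, a=2.0, b=0.5):
    """Return y = a + b*x: y (dependent) changes only because x (independent) changes."""
    return a + b * x

# Assumed values of the independent variable, as set by the experimenter.
xs = [0, 2, 4, 6]
ys = [dependent_value(x) for x in xs]
print(ys)
```

Each change in x produces a corresponding change in y, which is what makes y the dependent variable.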

Discrete Variables: It is one which can take only isolated values. It changes by finite jumps, with no values in between.

For example, the number of rooms in a building will be either 4 or 5, that is, a whole (natural) number.

Continuous Variables: It is one which can take any conceivable fractional value within the range of possibilities.

For example, the height of students. Such variables can also assume non-natural (fractional) values.

TYPES OF DATA

Types of Data: Primary Data, Secondary Data

Primary Data :- The data which is collected for the first time by the researcher is known as primary data. Primary data is in the shape of raw material to which different statistical techniques, methods and tools are applied to reach the final interpretations.

Secondary Data :- It is data which has already been collected by someone or some agency and which the researcher uses for his/her research in order to save time, effort and money. At times, it may not be possible for the researcher to collect the information himself/herself. Secondary data is in the form of a finished product which is ready for analysis.

Data collection usually takes place in an improvement process. It is an important part of the project and is formalized through a Data Collection Plan that consists of the following activities:-

Pre Collection Activity

1. Agree on goals, target data and selection of the data collection method.

Collection Activity

1. Following of the plans associated with the selected data collection technique.

2. Analysis of data by graphing as far as possible.

In data collection, the pre-collection activity is one of the most crucial steps in the process. After the pre-collection activity is completed, data collection in the field, by the various methods discussed later, can be done in a structured, systematic and scientific way.

A formal data collection process is necessary, as it ensures that the data gathered are both defined and accurate and that subsequent decisions based on arguments embodied in the findings are valid. The process of data collection provides a baseline from which we decide the target points where we improve ourselves.

METHODS OF COLLECTING DATA

Methods of Collecting Data: Census, Survey, Sampling, Secondary data sources

CENSUS

The word is of Latin origin. In the Roman Republic, the census was a list that kept track of all adult males fit for military service.

However, according to the modern definition, a census is the procedure of systematically acquiring and recording information about every member of a given population. The modern census is essential to international comparisons of any kind of statistics, and censuses collect data on many attributes of the population, not just how many people there are, although population estimates remain an important function of a census.

It is a regularly occurring and official count of a particular population. Other common censuses: Housing Census, Agriculture Census, Business Census, Traffic Census

Advantages of collecting data through census:

1. Accurate

2. Detailed

Disadvantages of collecting data through census:

1. Costly

2. Time consuming

3. Constraint on geographical accessibility

SAMPLING

Sampling is a data collection method that includes only part of the total population. In other words, sampling is concerned with the selection of a subset of individuals from a statistical population in order to estimate the characteristics of the whole population.

The main aim of the investigator in drawing the sample is to reduce the unmanageable heterogeneous population to a handy one in which all types are equally represented. The inferences are reliable because it can be proved mathematically that the mean of a random sample is an unbiased estimate of the population mean.
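The claim about the sample mean can be checked on a small case. The sketch below uses a hypothetical five-value population and enumerates every possible sample of size 2; it shows only that the sample means average out to the population mean, not a general proof.

```python
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8, 10]  # hypothetical population values
sample_size = 2

# Mean of every possible sample of the given size (10 samples here).
sample_means = [mean(s) for s in combinations(population, sample_size)]

population_mean = mean(population)
average_of_sample_means = mean(sample_means)
print(population_mean, average_of_sample_means)
```

Individual samples can be far from the population mean, which is why the way the sample is drawn matters as much as its size.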

INTERMEDIATE STEPS DURING COLLECTION OF DATA THROUGH SAMPLING

1. Defining the population of concern.

2. Specifying a sampling method.

3. Implementing the sampling plan.

BROAD DIVISION OF SAMPLING

1. Subjective or purposive sampling :- It is the one where the samples are drawn according to certain rules. Here the personal element and the predetermined choice of the investigator come into play. For example, if one is to select 20% of the towns from a lot in order to study their characteristics and generalize them over all towns, one may, for some reason or other (such as easier approachability), have a weakness for some of them and may even include them in the sample whether or not they represent the group or area.

2. Objective or probability sampling :- Here the sample is selected from a given set in such a way that each and every item or individual has an equal chance of being selected; hence it is also known as probability sampling. Samples are drawn by the chit or card method, or with the help of a random number table. All items are written on chits and then the required number of chits is drawn.
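The chit method can be imitated in code. This is a minimal sketch using Python's standard `random` module; the function name `draw_chits`, the list of 50 towns and the seed are assumptions made for illustration.

```python
import random

def draw_chits(items, n, seed=None):
    """Draw n items without replacement; every item has an equal chance."""
    rng = random.Random(seed)  # the seed only makes the illustration repeatable
    return rng.sample(list(items), n)

towns = [f"Town {i}" for i in range(1, 51)]  # hypothetical population of 50 towns
sample = draw_chits(towns, 10, seed=1)       # 20% of the towns
print(sample)
```

Because the draw is made without replacement, no town can appear twice, just as a chit cannot be drawn twice from the box.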

Demerits of random sampling:-

Sometimes the inferences drawn from random sampling may be unreliable, because more units may be drawn from one type of field and fewer from another, and some classes may even remain unrepresented. So, in order to arrive at more correct inferences, it is advisable to divide the heterogeneous universe into homogeneous classes known as "STRATA".

Division of Random Sampling: Simple random sampling, Stratified random sampling

Simple Random Sampling :- It does not use strata; that is, the heterogeneous population to be observed is not subdivided into homogeneous groups.

Stratified Random Sampling :- When the sampling is done after subdividing the given heterogeneous population into a few homogeneous groups known as "strata", it is known as stratified random sampling.
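A minimal sketch of stratified random sampling follows. It assumes proportional allocation (the same fraction is drawn from every stratum), which is one common choice and not something the slides specify; the strata names and sizes are hypothetical.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction from each stratum by simple random sampling."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical heterogeneous population split into three homogeneous strata.
strata = {
    "small": [f"S{i}" for i in range(60)],
    "medium": [f"M{i}" for i in range(30)],
    "large": [f"L{i}" for i in range(10)],
}
chosen = stratified_sample(strata, 0.2, seed=7)
counts = {name: len(units) for name, units in chosen.items()}
print(counts)
```

Every stratum contributes to the sample, so no class can remain unrepresented.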

SYSTEMATIC SAMPLING

It relies on arranging the study population according to some ordering scheme and then selecting elements at regular intervals through that ordered list. Systematic sampling involves a random start and then proceeds with the selection of every Kth element from then onward, where K = population size / sample size. It is important that the starting point is not automatically the first in the list, but is instead randomly chosen from the first to the Kth element.

As long as the starting point is randomized, systematic sampling is a type of probability sampling. The stratification it induces can make it more efficient if the variable by which the list is ordered is correlated with the variable of interest.

For example :- Suppose we wish to sample people from a long street that starts in a poor district (H.No. 1) and ends in an expensive district (H.No. 1000). A simple random selection of addresses from the street could easily end up with too many from the high end and too few from the low end (or vice versa), leading to an unrepresentative sample. Selecting every 10th house number ensures that the sample is spread evenly along the length of the street, representing all of these districts equally.

Drawbacks of systematic sampling:

1. Periodicity in the data.

2. The arrangement of samples.
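The street example can be sketched as follows. The function name `systematic_sample` and the seed are assumptions; the code takes K as population size divided by sample size and picks the start at random among the first K elements.

```python
import random

def systematic_sample(ordered_units, sample_size, seed=None):
    """Pick every K-th unit after a random start among the first K units."""
    k = len(ordered_units) // sample_size   # K = population size / sample size
    rng = random.Random(seed)
    start = rng.randrange(k)                # random start: 0-based index below K
    return ordered_units[start::k][:sample_size]

house_numbers = list(range(1, 1001))        # H.No. 1 to H.No. 1000
sample = systematic_sample(house_numbers, 100, seed=3)
print(sample[:5])
```

If the list itself repeats with a period equal to K (for example, every 10th house is a corner house), the sample would contain only one kind of unit, which is the periodicity drawback.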

CLUSTER SAMPLING

It is sometimes more cost effective to select respondents in groups (clusters). Sampling is then often done on a geographical basis.

For example, if we are surveying households in a city, we might choose to select 100 blocks and interview every household within them. This can reduce travel and administration costs.
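A minimal sketch of the block example: whole blocks are chosen at random and every household inside a chosen block is kept. The city of 500 blocks with 8 households each is hypothetical, as is the function name `cluster_sample`.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical city: 500 blocks with 8 households each.
city = {f"block-{b}": [f"b{b}-h{h}" for h in range(8)] for b in range(500)}
selected = cluster_sample(city, 100, seed=11)
households = [h for units in selected.values() for h in units]
print(len(selected), len(households))
```

The randomness applies to blocks, not to households, so interviewers visit only 100 locations.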

LINE INTERCEPT SAMPLING

It is a method of sampling elements in a region whereby an element is sampled if a chosen line segment intersects that element.

A STRATIFIED SAMPLING IS MORE EFFECTIVE WHEN THESE CONDITIONS ARE MET:

1. Variability within strata is minimized.

2. Variability between strata is maximized.

ADVANTAGES OVER OTHER SAMPLING METHODS:

1. Focuses on important subpopulations and ignores irrelevant ones.

2. Allows the use of different sampling techniques for different subpopulations.

3. Improves accuracy.

DISADVANTAGES OVER OTHER SAMPLING METHODS:

1. Requires selection of relevant stratification variables, which can be difficult.

2. It is not useful when there are no homogeneous subgroups.

3. Can be expensive to implement.

SURVEYS

A field of applied statistics, survey methodology studies the sampling of individual units from a population and the associated data collection techniques, such as questionnaire construction, and other methods for improving the accuracy of responses to the survey.

A single survey may focus on different topics such as:

1. Preference for a candidate

2. Opinion

3. Behavior (for example, whether smoking or drinking alcohol is good or bad)

4. Factual information (income)

Survey Methodology Topics

1. Identifying and selecting potential sample members.

2. Collecting data from those who are hard to reach.

3. Evaluation and testing of questions.

4. Selection of the mode for posing questions and collecting responses.

5. Training and supervising interviewers.

Administering of a Survey

The choice of the mode of administering a survey is affected by the following factors:

1. Cost

2. Coverage of the target population

3. Flexibility in asking questions

4. Respondents' willingness to participate

5. Response accuracy

Some common modes of administering a survey

1. Telephone

2. Mail

3. Online

4. Personal in-home surveys

5. Hybrids of the above

FACTORS THAT TOGETHER MAKE A SURVEY METHOD SUCCESSFUL:

1. Modes of data collection in surveys for a given class of population: the proper mode of questionnaire should be applied for each class of people to be surveyed.

2. Response Formats :- A survey contains a number of questions that the respondent has to answer in a set format. A distinction is made between open-ended and closed-ended questions.

An open-ended question asks the respondent to formulate his/her own answer, while a closed-ended question asks the respondent to pick an answer from a given number of options. The response options for a closed-ended question should be mutually exclusive (no two options overlap) and exhaustive (every possible answer is covered).
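For numeric response options, such as age brackets, both properties can be checked mechanically. The sketch below assumes each option is a half-open range [start, end); the function name `check_options` and the brackets are hypothetical.

```python
def check_options(options, low, high):
    """Return (mutually_exclusive, exhaustive) for half-open ranges over [low, high)."""
    ordered = sorted(options)
    exclusive = True
    exhaustive = ordered[0][0] <= low
    covered_to = ordered[0][1]
    for start, end in ordered[1:]:
        if start < covered_to:
            exclusive = False   # this option overlaps an earlier one
        if start > covered_to:
            exhaustive = False  # some answers fall between two options
        covered_to = max(covered_to, end)
    exhaustive = exhaustive and covered_to >= high
    return exclusive, exhaustive

good = [(18, 30), (30, 45), (45, 60), (60, 120)]
bad = [(18, 30), (25, 45), (50, 120)]  # 25-30 overlaps, 45-50 is missing
print(check_options(good, 18, 120))
print(check_options(bad, 18, 120))
```

A respondent aged 27 could tick two boxes in the second set, and a respondent aged 47 could tick none.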

The following ways have been recommended for reducing non-response:

1. Advance Letter :- A short letter is sent in advance to inform the sampled respondents about the upcoming survey. The style of the letter should be personalized. First, it announces that a phone call will be made or that an interviewer wants to make an appointment to do the survey face to face. Second, it describes the research topic.

2. Language should be considered properly.

3. Interviewer effects.

Methods of representation of Data

Methods of representation of data: Tabular representation, Graphical representation, Diagrammatic representation

TABULAR REPRESENTATION

The arrangement of data in a set of rows and columns is referred to as tabular representation. To tabulate the data, we first classify the given data on the basis of similarity. Classification provides the basis for tabulation.

    Systematic representation of the given data.

Easy identification of desired value.

Easy identification of trends in data.

Basis for decision making.

o Table Number - The purpose is to identify a particular table. It is to be used when a given discussion has more than one tabular representation.

o Stub - The title given to the rows is called a stub.

o Caption - The title given to the columns is known as the caption. It is also called the box-head. There may be various sub-captions under one caption.

o Body - The numerical information present in the set of rows and columns is called the body of the table.

o Footnote - This is an explanatory note written beneath the table it refers to. Its purpose is to explain omissions.

o Totals - Where needed, totals and sub-totals for columns and rows can be given to make the presentation more useful to the reader.

Dos and Don'ts in Tabular representation

It should be neat and simple.

It should avoid the use of abbreviations; if used, they should be explained in the footnote.

It should have a self-explanatory title.

Captions, sub-captions and stubs should all be clear and brief.

All numerical values should be rounded off to a common decimal place.

Figures which need to be emphasized should be put between two thick lines or in a box.

For missing values, N.A. (NOT APPLICABLE) should be written, and it should be explained in the footnote.

[Figure: layout of a table, showing table number, title, caption, sub-caption, stub, stub entries, body, total and footnote]

DIAGRAMMATIC REPRESENTATION

There is a very old saying, "A picture is worth 10,000 words", and it is very true. In statistics we try to achieve the same effect through the visual representation of data by diagrams and graphs.

It eliminates the dullness of numerical information.

It makes data comparison easy.

It helps in locating the values of statistical measures like Mean, Median and Mode.

A diagram is drawn on two axes, the x-axis and the y-axis.

A diagram must be given a proper title, preferably at the top.

A diagram should be as informative as possible.

In a diagram, the y-axis is generally not broken, i.e. the values of the dependent variable should start from zero.

TYPES OF DIAGRAM

Types of diagram: 1-D, 2-D, 3-D

o 1-Dimensional :- Only length is used to represent a given quantity, as in a bar diagram.

o 2-Dimensional :- Length and breadth together represent the desired quantities. Since the areas of such diagrams are in proportion to the values they represent, 2-dimensional diagrams are also known as areal diagrams. Examples: circle, square.

o 3-Dimensional :- Length, breadth and height all become significant, and as a result the volume of these diagrams represents the given data. Examples: cube, cuboid, cylinder.

Types of 1-dimensional diagram

1. Simple Bar Diagram

2. Multiple Bar Diagram

3. Component Bar Graph

4. Deviation Bar Graph

5. Broken Bar Diagram

SIMPLE BAR DIAGRAM

In this type of bar diagram, the length or height of a bar is significant and is proportional to the quantity that it represents. The width of the bar is not significant, but it is kept constant. The gap between two bars is also kept constant.

This type of graph is useful when we have to represent different values over a given interval of time.

Data representable through simple bar graph: SALE OF AN MNC (IN LAKHS)

YEAR       SALE
1983-84    153
1984-85    192
1985-86    234
1986-87    338
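The proportionality rule can be sketched with a text-only bar chart built from the sales figures above. The maximum width of 40 characters and the function name `bar_lengths` are arbitrary choices for illustration.

```python
sales = {"1983-84": 153, "1984-85": 192, "1985-86": 234, "1986-87": 338}

def bar_lengths(values, max_width=40):
    """Scale each value so that bar length is proportional to the quantity."""
    largest = max(values.values())
    return {label: round(value / largest * max_width) for label, value in values.items()}

lengths = bar_lengths(sales)
for label, length in lengths.items():
    # Every bar has the same thickness (one text row); only the length varies.
    print(f"{label} | {'#' * length} {sales[label]}")
```

Only the bar length carries information; the bars start from zero, so their lengths can be compared directly.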

MULTIPLE BAR DIAGRAM

In this diagram, for a particular case on the x-axis, we have more than one bar. The bars are drawn adjoining each other. These bars represent either various components of an aggregate or different variables (with the same unit of measurement).

The width of the bars remains the same and the gap between two sets of bars is kept constant.

Data corresponding to a multiple bar diagram

        Department
Year    Arts    Science    Commerce
1985    20      10         12
1986    25      12         34
1987    40      35         23
1988    45      35         52

SUBDIVIDED BAR GRAPH

When the purpose is to represent the various components of different aggregate values of a variable for different time periods, we use a subdivided bar graph.

Here a bar is first drawn for a particular case to represent the aggregate, and it is then broken into its various components. The width is kept constant.

Data corresponding to subdivided bar graph

        Department
Year    Arts    Comm.    Science    Total
1985    300     200      100        600
1986    250     250      200        700
1987    300     200      300        800

Percentage Subdivided Bar graph

It is a subdivided or component bar diagram drawn on a percentage basis. It is drawn when the purpose is to represent relative changes.

Here all the bars have the same length, representing 100%, and each bar is subdivided in proportion to the relative contribution of each component to its aggregate.
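The percentages can be worked out from the figures given for the subdivided bar graph (Arts, Comm. and Science for 1985 to 1987). This is a minimal sketch; the function name `percentage_shares` and the rounding to one decimal place are assumptions.

```python
# Figures from the data corresponding to the subdivided bar graph.
data = {
    1985: {"Arts": 300, "Comm.": 200, "Science": 100},
    1986: {"Arts": 250, "Comm.": 250, "Science": 200},
    1987: {"Arts": 300, "Comm.": 200, "Science": 300},
}

def percentage_shares(components):
    """Express each component as a percentage of the aggregate for that year."""
    total = sum(components.values())
    return {name: round(100 * value / total, 1) for name, value in components.items()}

shares = {year: percentage_shares(parts) for year, parts in data.items()}
for year, parts in shares.items():
    print(year, parts)
```

Arts has the same absolute value in 1985 and 1987, yet its share falls from 50% to 37.5% because the aggregate grows; this is the kind of relative change the percentage diagram brings out.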

DEVIATION BAR GRAPH

This diagram is used when we have to represent net quantities, for example the net profit or the net loss. The positive values are shown above the x-axis and the negative values below the x-axis.

BROKEN BAR DIAGRAM

It is a special class of simple bar diagram. It is useful when one or a few values are much larger than the rest. In such cases, for a more logical and visually appealing presentation, we break the bars of the extreme values so that they fit into the same space as the others.

GRAPHICAL REPRESENTATION

IMPORTANCE:

It makes data comparison easier.

Helps in establishing trends in the data and hence helps in forecasting.

Helps in finding positional averages like the median and mode.

Graphs of statistical data can be broadly divided into the following two types:

(A) Graphs of time series (historigrams):

1. Historigram of one dependent variable

2. Historigram of more than one dependent variable

3. Mixed graph

4. Range graph

(B) Graphs of frequency distribution:

1. Graph using simple frequency

2. Graph using cumulative frequency
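The difference between simple and cumulative frequency can be shown with a short sketch. The class intervals and frequencies below are hypothetical; the cumulative column is the running total of the simple frequencies ("less than" type).

```python
from itertools import accumulate

# Hypothetical frequency distribution: class interval and simple frequency.
classes = ["0-10", "10-20", "20-30", "30-40"]
frequencies = [4, 9, 12, 5]

cumulative = list(accumulate(frequencies))  # running total of the frequencies
for label, f, cf in zip(classes, frequencies, cumulative):
    print(f"{label:>6} {f:>4} {cf:>4}")
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the table before plotting it.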

THANK YOU