
Statistics Vocabulary



    For the source page for this vocabulary go to Statistics: Much Vocabulary and Images, Little Computation.

    For statistics resources go to MIDDLE GROUND, the statistics info page.



Vocabulary


Population, Sample, Data, Statistic
  • SAMPLE - a collection of items taken from the population - 1, more than 1, or perhaps all of them.
     
  • POPULATION - a collection of all the items under consideration (here, you would know this by emptying or filling the tank).
     
  • STATISTICAL SAMPLE - a collection of raw data - counts, lengths, durations of time, test scores, integers, real numbers, etc.
     
  • STATISTICAL POPULATION - a collection of all the raw data - counts, lengths, durations of time, test scores, integers, real numbers, etc.
    One usually does not examine the entire population. Population statistics are often given in a situation. You might be studying the statistical sample.
     
  • DATA or DATA POINT or RAW DATA - 1 piece of information -- as in the length of a fish, as in the number of fins on a fish, as in the color of a fish.
     
  • STATISTIC - the processed information -- as in the average length of all the fish in the sample.


Discrete or Continuous
  • CONTINUOUS - all numbers are used over the desired interval
    You might have data that are real numbers on a range 0 < x < 6: numbers like 4.387, .58, 5.55555... or you might have real numbers from -3 to 5.
    Examples:
    ·       the length of a fish, measured in meters with a meter stick or tape
    (real numbers)
    · a length of time, as in how many months till a birthday
    · a height,
    · a weight
    · the temperatures on Wednesday mornings in January
  • DISCRETE - only certain numbers are used over the desired interval
    You might have data that are integers: numbers like -3, 0, -10, 4, 6, 10, 11, or natural (counting) numbers: numbers like 1, 2, 3, 4, ...
    Examples:
    ·       the length of fish, measured to the nearest half inch
    (whole numbers and halves only)
    ·       a number of things, as in how many nickels there are in a jar,
    (whole numbers)
    ·       a shoe size, as in 8, 8 1/2, 9, 9 1/2
    ·       a grade in school, as in: 1st, 2nd, 3rd
    ·       a night's win (+) or loss (-) in dollar bets (integers)


Take a Sample
  • RANDOM SAMPLE - a sample in which each selection, each fish, each data point, is equally likely to be chosen
     
  • POPULATION SIZE - the number of things (like fish) or their data points (like length - either discrete or continuous) in the population under consideration.
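A random sample can be sketched in a few lines of Python. The tank of 12 fish and their lengths are made up for illustration, and the fixed seed is only there so the sketch gives the same sample each time it is run.

```python
import random

# Hypothetical population: the lengths (in cm) of all 12 fish in a tank.
population = [10.2, 11.5, 9.8, 12.1, 10.9, 11.0, 9.5, 12.4, 10.0, 11.8, 10.6, 9.9]

population_size = len(population)  # number of data points in the population

# random.sample chooses without replacement, and every fish is
# equally likely to be chosen.
rng = random.Random(2018)  # fixed seed, an assumption made for repeatability
sample = rng.sample(population, k=4)

sample_size = len(sample)  # usually symbolized by n
print(population_size, sample_size, sample)
```

Changing `k` changes the sample size. The population size stays the same because nothing is removed from the tank.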


Look at the Data
  • SAMPLE SIZE - NUMBER - the number of data points, pieces of data in a sample. Usually symbolized by "n."
     
  • ORDERED DATA - data which has been sorted by size, smallest first.

     
  • FREQUENCY - number of occurrences. It reports how many times each specific data point is found in the sample.
     
  • FREQUENCY DISTRIBUTION - a table of the frequency of each data point or interval

     
  • BAR GRAPH - a graphic way of displaying each data point and its frequency. It is used for discrete data. On the horizontal axis, the data points are placed on a number line. On the vertical axis, frequencies are listed. For each data point, a bar goes from the horizontal axis to the height of the required frequency to display the information.

     
  • HISTOGRAM - a graphic way of displaying intervals of data points and their frequencies. It is used for continuous data. On the horizontal axis, the intervals for the data points are placed on a number line. On the vertical axis, frequencies are listed. For each interval, a bar goes from the horizontal axis to the height of the required frequency to display the information. Note: The histogram has "fat" bars because each number in the interval must be accounted for, whereas a bar graph only displays the data points it needs.

     
  • STEM-AND-LEAF DIAGRAM - a graphic way of displaying each data point, its frequency, and its position in ordered data. It is used for discrete data. The units digit of each data point is used as a "leaf" and placed to the right of the "tree" (the stem) of larger digits, which is used as a kind of interval. Each "leaf" in the "stack of leaves" makes the "stack" longer, as a frequency bar would.

     
  • CAPITAL SIGMA - Σ - a math symbol meaning "add up the terms."
  • SAMPLE STATISTICS - numbers obtained by looking at the sample data in different ways. They are:
            1. used to create a purely numeric picture of the sample,
            2. used to help create the graphic representations of the sample,
            3. used to stand in as a numeric description of the population because none is available,
            4. used in testing if things about the population the sample represents are true.
     
            All of these are sample statistics. Some are used to approximate/represent:
     
          1st:   CENTER - "a one number representation" of the entire sample:
    AVERAGE -- any of the following three sample statistics
     
    MEAN -- symbol: $\bar{x}$, read as "x bar" - arithmetic average. Formula: $\bar{x} = \frac{\sum x}{n}$
     
    MODE -- most frequent score or data point
     
    MEDIAN -- symbol: $\hat{x}$, read as "x hat" -- median, Q2, the 50th percentile, the middle data point when the data is ordered from lowest to highest

          2nd:   SPREAD - "how the data spreads out":
    RANGE - the spread of the data from the highest data point to the lowest data point, x_max - x_min
     
    INTERQUARTILE RANGE - the spread of the middle 50% of the data, the difference between the 3rd quartile and the 1st quartile, Q3 - Q1, where Q3 is the 75th percentile and Q1 is the 25th percentile. See box-and-whisker with more info.
     
    VARIANCE - the square of the standard deviation.
     
    STANDARD DEVIATION - the average spread of the data computed in the standard way. For a sample, the usual formula is: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
  • STAT SYMBOLS -
  • BOX & WHISKERS PLOT - a drawn to scale representation of how certain sample statistics spread across the sample.

    box-and-whisker with more info

     
  • SYMMETRIC - having a left to right (bilateral) symmetry: the mean or median is in the middle and the tails are the same length and "fatness."
     
  • Some DISTRIBUTIONS - how the data is spread out, what shape the frequency distribution or histogram has.
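The vocabulary in this section can be tried out with Python's standard `statistics` module. This is only a sketch: the nine fish lengths are made up, and the quartiles use the module's default method. Other textbooks and calculators use slightly different quartile rules, so Q1 and Q3 may not match them exactly.

```python
from collections import Counter
import statistics

# Hypothetical sample: lengths of 9 fish, measured to the nearest inch.
sample = [7, 9, 8, 7, 10, 12, 7, 9, 11]

n = len(sample)               # sample size
ordered = sorted(sample)      # ordered data, smallest first
frequency = Counter(ordered)  # frequency of each data point

# A text version of a bar graph: one mark per occurrence.
for value in sorted(frequency):
    print(value, "#" * frequency[value])

# CENTER
mean = sum(sample) / n                  # x bar = (sum of x) / n
median = statistics.median(sample)      # middle data point of the ordered data
mode = statistics.mode(sample)          # most frequent data point

# SPREAD
data_range = max(sample) - min(sample)  # x_max - x_min
q1, q2, q3 = statistics.quantiles(sample, n=4)  # quartiles, default method
iqr = q3 - q1                           # interquartile range
variance = statistics.variance(sample)  # sample variance, divides by n - 1
std_dev = statistics.stdev(sample)      # square root of the sample variance

print(mean, median, mode, data_range, iqr, variance, std_dev)
```

`statistics.pvariance` and `statistics.pstdev` divide by n instead of n - 1 and are meant for data that make up a whole population.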


Theoretical vs Experimental & Descriptive vs Analytical Statistics
  • EXPERIMENTAL & DESCRIPTIVE STATISTICS - statistics where data is collected, analysed, depicted, and described
     
  • THEORETICAL STATISTICS - statistics where mathematics and common sense are used to examine a situation

     
  • TREE DIAGRAM - a paper and pencil way of figuring out the sample space of a multi-step experiment (see above)
          1st: All possible results of the first stage of the experiment are listed vertically on the far left.
          2nd: From each of the first stage events, branches are drawn to the right, to all possible events in the second stage of the experiment.
          This record-keeping continues until all possible outcomes of each stage of the experiment are listed w/branches drawn.
     
  • EVENT - a result, data point, outcome, of an experiment
     
  • SAMPLE SPACE - the set of all possible outcomes/results/events of an experiment

     
  • ANALYTICAL STATISTICS - statistics where judgements are made
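The tree diagram and sample space entries above can be sketched with `itertools.product`, which follows every branch of every stage. The two-stage experiment here (flip a coin, then spin a spinner numbered 1 to 4) is a made-up example.

```python
from itertools import product

# Hypothetical two-stage experiment: flip a coin, then spin a 1-4 spinner.
stage_1 = ["H", "T"]
stage_2 = [1, 2, 3, 4]

# Print the tree: first-stage results on the left, branches to the right.
for first in stage_1:
    print(first)
    for second in stage_2:
        print("   ->", second)

# Each complete path through the tree is one outcome in the sample space.
sample_space = list(product(stage_1, stage_2))
print(len(sample_space), sample_space)
```

The number of outcomes is the product of the number of branches at each stage, here 2 x 4 = 8.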

     


Probability
  • PROBABILITY - a branch of mathematics that studies populations, samples, experiments, and hypotheses. It includes experimental, analytical, and theoretical statistics.
     
  • PROBABILITY OF AN EVENT - the probability of the data point, or set of numbers, or interval, being the number x, p(x), or being the set of numbers, p(a, b, ..., c), or the interval, p(a < x < b) - a number between 0 and 1 that compares the number of times a specific outcome or event may happen in a situation to the number of possible outcomes in that situation.
          If the probability of the event is 0, the event does not happen.
          If the probability of the event is 1, the event happens.
     
  • EXPECTED VALUE - the mean, the arithmetic average: the sum of the numbers divided by the number of numbers.
          expected value = (the sum of all the data) / (number of events in the sample space)
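As a sketch of the probability of an event and of expected value, take one roll of a fair die, where every outcome is equally likely. The die and the helper name `probability` are assumptions made for this example.

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair six-sided die.
sample_space = [1, 2, 3, 4, 5, 6]

def probability(event, space):
    """Outcomes in the event divided by the number of possible outcomes."""
    favorable = [outcome for outcome in space if outcome in event]
    return Fraction(len(favorable), len(space))

p_even = probability({2, 4, 6}, sample_space)          # 3 of 6 outcomes
p_seven = probability({7}, sample_space)               # cannot happen
p_any = probability(set(sample_space), sample_space)   # must happen

# Expected value: (sum of all the data) / (number of events in the sample space)
expected_value = Fraction(sum(sample_space), len(sample_space))

print(p_even, p_seven, p_any, expected_value)
```

The expected value, 7/2 or 3.5, is not itself a possible roll. It is the long-run average of many rolls.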

     


More Vocabulary and Topics that Are Not Included on this Page
  • INDEPENDENT EVENT - uninfluenced, stand alone, the result of one stage or trial has no effect on another stage or trial.
        ex. Raw sample data - the number of heads when 3 coins are flipped.  The result of each flip, or trial, is not influenced by the other flips.
     
  • DEPENDENT EVENT - having an influence on other stages or trials, the result of one stage of the collection of raw data, has an effect on another stage.
        ex. Raw sample data - the names of a president and vice president of a club with 4 members.
               One officer must be picked at a time or you would not know which officer was which. For instance: There are 4 choices for president, but only 3 choices for vp. The result of the first stage influences the second stage.
     
  • WITH REPLACEMENT - restore the original conditions after each trial. After a trial or stage, the setting is restored to the original setting before beginning the next trial or stage.
        ex. Raw sample data - draw a card from a deck, replace the card in the deck, draw a card from the deck. The 2 draws are independent. The replacement gives each draw the same possible outcomes.
     
  • WITHOUT REPLACEMENT - do not restore the original conditions after each trial, use the new conditions.
        ex. Raw sample data - draw a card from a deck, draw a 2nd card from the deck.
     
  • ORDER COUNTS - raw data has a 1st, 2nd, 3rd. Ex. officers in a club.
     
  • ORDER DOESN'T COUNT - the order of the raw data does not matter. Ex. a committee (without a chair) is chosen. It doesn't matter how the members are listed.
     
  • FACTORIAL - symbol: n!, the product of a natural number and all the natural numbers less than it, n!=n(n-1)...2·1. See a use on: Number of Ways to Make An Ordered List
    Questions.       The answers are shown between the stars.
    1. 0! is *1, by definition *
    2. 1! is *1 is 1*
    3. 2! is *2x1 is 2*
    4. 3! is *3x2x1 is 6*
    5. 4! is *4x3x2x1 is 24*
    6. 5! is *5x4x3x2x1 is 120*
    7. 6! is *6x5x4x3x2x1 is 720*
    8. 7! is *7x6x5x4x3x2x1 is 5040*
    9. 8! is *8x7x6x5x4x3x2x1 is 40320*
    10. 9! is *9x8x7x6x5x4x3x2x1 is 362880*
    11. 10! is *10x9x8x7x6x5x4x3x2x1 is 3628800*
    The numbers grow quickly. For example 6! is the number of ways 6 people could line up -- order counts.

  • COUNTING METHODS - Basic Probability & Counting Problems
     
  • PERMUTATIONS - Number of Ways to Make An Ordered List
     
  • COMBINATIONS - Number of Ways to Make A Group
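The counting ideas in this section can be checked with Python's `math` module (version 3.8 or later). The numbers come from the examples above: a club with 4 members, 6 people lining up, and 2 cards drawn from a 52-card deck.

```python
import math

# FACTORIAL: the number of ways 6 people could line up (order counts).
lineups = math.factorial(6)

# PERMUTATIONS (order counts): a president, then a vice president,
# from a club with 4 members: 4 choices, then 3 choices.
officers = math.perm(4, 2)

# COMBINATIONS (order doesn't count): a 2-person committee
# from the same 4 members.
committees = math.comb(4, 2)

# Two cards drawn in order from a 52-card deck.
with_replacement = 52 * 52     # the deck is restored before the 2nd draw
without_replacement = 52 * 51  # the 2nd draw has one fewer card

print(lineups, officers, committees, with_replacement, without_replacement)
```

Each committee of 2 can be listed in 2! = 2 orders, which is why there are half as many committees as officer slates.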


The Binomial Distribution
  • BINOMIAL -- See MIDDLE GROUND - Brief Summary of A Binomial Distribution
     
  • SUCCESS - one of two possible outcomes of a binomial trial. The probability of success is p. Note: p = 1 - q
     
  • FAILURE - one of two possible outcomes of a binomial trial. The probability of failure is q. Note: q = 1 - p
     
  • DISTRIBUTION - the way the data in a population is centered and spreads out.
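This page does not give the binomial formula, so the sketch below assumes the standard one: the probability of exactly k successes in n independent trials is C(n, k) times p^k times q^(n - k). The function name and the 3-coin example are chosen for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_probability(n, k, p):
    """Probability of exactly k successes in n independent binomial trials."""
    q = 1 - p  # probability of failure
    return comb(n, k) * p**k * q**(n - k)

# Hypothetical experiment: flip 3 fair coins and count heads (success = head).
p = Fraction(1, 2)
distribution = {k: binomial_probability(3, k, p) for k in range(4)}

for k, probability in distribution.items():
    print(k, "heads:", probability)
```

The four probabilities (for 0, 1, 2, and 3 heads) add up to 1, as the probabilities in any distribution must.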
     


Thank Goodness for Probability Density Functions
  • DISTRIBUTION - the way the data is centered and spreads out.
    · the way the numbers in a situation are impacted by the function or rule.
    · Usually the distribution is written algebraically as in f(x) = ...
    ex. f(x) = sin(x), the function is the sine of a number

     
  • FUNCTION - a really dependable rule. It is usually written as f(x) where x is the variable, changeable, number
    ex. The area of a rectangle is always the product of its length and width:
              A(l,w) = l(w).
     
  • FREQUENCY - number of occurrences. It reports how many times each specific data point is found in the sample.
     
  • FREQUENCY DISTRIBUTION - a table or graph showing the frequency of each data point or interval
     
  • RANDOM VARIABLE - a variable whose value is a number determined by chance, by the outcome of an experiment or a random selection
     
  • INTERVAL - a range of the variable from a number (say a) to a higher number (say b), as in:
    from a to b, a < x < b
    from a to b including a and b, a ≤ x ≤ b
    from a to b including a but not b, a ≤ x < b
    from a to b including b but not a, a < x ≤ b
     
  • CONTINUOUS
     
  • AREA UNDER THE CURVE - the sum of all function values, f(x), for each x in the interval. See Statistics Lab 5 - Probabilities for details.

     
  • PROBABILITY DISTRIBUTION FUNCTION - a continuous function,
    f(x), of probabilities, such that f(x) is never negative and the sum of the probabilities (the total area under the curve) is 1.
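For a continuous function, "the sum of all function values" is found as an area, and an area can be approximated by adding up many thin rectangles. The sketch below does that with the midpoint rule. The function f(x) = 2x on the interval from 0 to 1 is a made-up example chosen because its total area is 1.

```python
def f(x):
    """Hypothetical probability function on 0 <= x <= 1: f(x) = 2x."""
    return 2 * x

def area_under_curve(func, a, b, steps=10_000):
    """Approximate the area under func from a to b with thin rectangles,
    each as tall as the function at the middle of its base."""
    width = (b - a) / steps
    heights = (func(a + (i + 0.5) * width) for i in range(steps))
    return sum(heights) * width

total_area = area_under_curve(f, 0, 1)      # should be very close to 1
p_lower_half = area_under_curve(f, 0, 0.5)  # p(0 < x < 0.5)

print(round(total_area, 6), round(p_lower_half, 6))
```

The probability of an interval is the area over that interval, so here p(0 < x < 0.5) is about 0.25 even though the interval is half of the whole range.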

     


Normal and Standard Normal Distributions
  • e - a constant approximately equal to 2.718281828459045, as defined below.

     
  • z - the variable used to indicate that the work is with the standard normal distribution, which has a mean of 0 and a standard deviation of 1.
     
  • PI - π - the ratio of the circumference of a circle to its diameter, about 3.14159 or, more roughly, 22/7.
     
  • NORMAL DISTRIBUTIONS - or Gaussian distributions, continuous probability distributions (so the area under the curve equals 1), where the mean, mode, and median are all the same, so the data gathers about a center, making a symmetric bell-shaped curve. Many kinds of data -- heights of people, lengths of fish, errors in measurements, standardized test scores -- have approximately normal distributions.

     
  • STANDARD NORMAL DISTRIBUTION - a normal distribution having a mean of 0 and a standard deviation of 1. It is very useful in computing, and looking up, probabilities, comparing samples and populations, and analysis and hypothesis testing.

     
  • STANDARD NORMAL TABLE OF PERCENTS/PROBABILITIES - uses z-scores and their probabilities.

     
  • CUMULATIVE STANDARD NORMAL DISTRIBUTION - uses z-scores and their probabilities, beginning with z = -3 and ending with z = 3, but lists the sum of the probabilities from z = -3 to the desired z-score.

     
  • BELL-SHAPED CURVE - a normal distribution. It looks like a symmetric bell sitting on a table. The scores are piled in the center and trail off at the upper and lower range of variables.

     
  • WITHIN A SPECIFIC STANDARD DEVIATION OF THE MEAN - a range of scores centered about the mean and, in either direction, not farther on the number line than the specified number of standard deviations.
     
    ex. on the standard normal number line, "within 1 standard deviation of the mean" means from -1 to 1, -1 < z < 1, and includes about 68% of the scores
     
    ex. on the standard normal number line, "within 3 standard deviations of the mean" means from -3 to 3, -3 < z < 3, and includes about 99.7% of the scores.
     

     
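  • CHEBYCHEV'S RULE -- for any distribution, the fraction of scores within k standard deviations of the mean, k > 1, is at least 1 - 1/k². For example, at least 75% of the scores are within 2 standard deviations of the mean.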
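The percents for the standard normal distribution can be checked with `statistics.NormalDist` (Python 3.8 or later). The helper names below, and the score of 85 with mean 70 and standard deviation 10, are assumptions made for the example. The last function is the Chebyshev lower bound, which holds for any distribution and so is weaker than the normal percents.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def within(k):
    """Area under the standard normal curve for -k < z < k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

def z_score(x, mean, std_dev):
    """How many standard deviations x lies from the mean."""
    return (x - mean) / std_dev

def chebyshev_lower_bound(k):
    """For any distribution and k > 1: at least this fraction of scores
    lies within k standard deviations of the mean."""
    return 1 - 1 / k**2

for k in (1, 2, 3):
    print(k, round(within(k), 4))

print(z_score(85, 70, 10))       # hypothetical score, mean, std deviation
print(chebyshev_lower_bound(2))
```

A cumulative table entry for a z-score corresponds to `standard_normal.cdf(z)`, the area to the left of z.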


Confidence Interval
  • CONFIDENCE INTERVAL - a range of expected values, a range of scores in which the population parameter is believed to be.
     
  • alpha, α - "(1 - degree of confidence)," the probability a score does not fall into the confidence interval
     
  • LEVEL OF CONFIDENCE - $P(\bar{x} - E < \mu < \bar{x} + E)$ -- a probability, usually expressed as a percent, that states how sure one is about the decision made by the test.
        ex. Test with a 90% level of confidence that μ > 60, with 1 - α = 0.90 -- the probability a score is in the confidence interval containing the mean.

     
  • E, error -- the maximum distance from the mean that still places a score within the confidence interval.
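This page does not say how E is computed. A common choice, assumed here, is E = z-critical times the standard deviation divided by the square root of n, which applies when the population standard deviation is known. The function name and the numbers (sample mean 70, standard deviation 10, n = 25) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(sample_mean, sigma, n, confidence):
    """Interval sample_mean - E to sample_mean + E for the population mean,
    assuming the population standard deviation sigma is known."""
    alpha = 1 - confidence                            # 1 - degree of confidence
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # alpha split over 2 tails
    error = z_critical * sigma / sqrt(n)              # E, the maximum error
    return sample_mean - error, sample_mean + error

low, high = confidence_interval(sample_mean=70, sigma=10, n=25, confidence=0.95)
print(round(low, 2), round(high, 2))
```

A higher level of confidence gives a larger z-critical and so a wider interval. A larger sample size gives a narrower one.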


Hypothesis Testing
The z-test statistic is computed from the sample mean, the null hypothesis mean, the standard deviation, and the sample size:

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$


  • k -- constant, like 70, ex. μ = 70
     
  • HYPOTHESIS (no symbol) -- a theory or statement which may or may not be true
  • HYPOTHESIS TEST - a procedure by which a hypothesis (or statement) which one believes to be true is tested statistically against another hypothesis already in use.
     
  • NULL HYPOTHESIS - H0 -- read as "H 0" -- the null (original or beginning) hypothesis
     
  • ALTERNATE HYPOTHESIS - H1 -- read as "H 1" -- the alternate (new) hypothesis
     
  • ONE-TAIL TEST - used when the alternate hypothesis, H1, is μ > k or is μ < k (one direction only), where k is the null hypothesis mean.
     
  • TWO-TAIL TEST - used when the alternate hypothesis, H1, is μ ≠ k (μ may be either > k or < k), where k is the null hypothesis mean.
     
  • Z-CRITICAL - zcritical -- the boundary value(s) which end the confidence interval.
     
  • Z-TEST -- the evidence used to accept or reject the null hypothesis, usually computed from sample data.

     
  • TYPE I ERROR - "When the null hypothesis is true, you reject the null hypothesis."
     
  • TYPE II ERROR - "When the null hypothesis is false, you do not reject it."

     
  • P VALUE - the probability, if the null hypothesis is true, of getting a score at least as extreme as the test statistic.
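A one-tail z-test can be sketched as follows. The hypotheses (H0: μ = 70, H1: μ > 70), the sample numbers, and the function name are made up for illustration, and the population standard deviation is assumed to be known.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, k, sigma, n):
    """z-test statistic for a sample mean against the null hypothesis mean k."""
    return (sample_mean - k) / (sigma / sqrt(n))

# Hypothetical test: H0 is mean = 70, H1 is mean > 70 (one-tail, upper).
alpha = 0.05
z = z_test(sample_mean=73, k=70, sigma=10, n=25)

z_critical = NormalDist().inv_cdf(1 - alpha)  # boundary of the upper tail
p_value = 1 - NormalDist().cdf(z)             # area beyond the test statistic

reject_null = z > z_critical                  # same decision as p_value < alpha
print(z, round(z_critical, 3), round(p_value, 4), reject_null)
```

Here z does not pass z-critical, so the null hypothesis is not rejected. If the null hypothesis were in fact false, that decision would be a Type II error.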

     
Just vocabulary.




© 2018, Agnes Azzolino
www.mathnstuff.com/math/spoken/here/2class/90/fishvoc.htm