Notes: Machine Learning Scholarship Program for Microsoft Azure

Lesson 2. Introduction to Machine Learning

7. Tabular Data

  • A vector is simply an array of numbers, such as (1, 2, 3)—or a nested array that contains other arrays of numbers, such as (1, 2, (1, 2, 3)).

8. Scaling Data

  • Two common approaches to scaling data include standardization and normalization.
  • Standardization:
    • rescales data so that it has a mean of 0 and a standard deviation of 1
    • \((x - \mu)/\sigma\)
  • Normalization:
    • rescales the data into the range [0, 1]
    • \((x - x_{min}) / (x_{max} - x_{min})\)