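The learning rate $\alpha$ controls the size of each gradient descent step. If $\alpha$ is too small, convergence is slow; if it is too large, the updates can overshoot the minimum and diverge.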

When choosing $\alpha$, try values spaced roughly 3x apart: $\dots, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, \dots$
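
A minimal sketch of such a sweep, assuming a toy cost $J(w) = w^2$, a starting point of $w = 1$, and 50 update steps (none of which come from the note itself; `gradient_descent` is a hypothetical helper):

```python
def gradient_descent(alpha, steps=50, w0=1.0):
    """Run plain gradient descent on the toy cost J(w) = w**2; return the final cost."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w        # dJ/dw for J(w) = w**2
        w = w - alpha * grad  # gradient descent update
    return w ** 2

# Candidate learning rates spaced roughly 3x apart.
alphas = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3]
final_costs = {alpha: gradient_descent(alpha) for alpha in alphas}

for alpha, cost in final_costs.items():
    print(f"alpha={alpha:<6} final cost={cost:.3e}")

# Pick the candidate with the lowest final cost on this toy problem.
best_alpha = min(final_costs, key=final_costs.get)
print("best alpha:", best_alpha)
```

On this toy problem the smallest rates barely reduce the cost, $\alpha = 1$ bounces between $w = 1$ and $w = -1$ without improving, and $\alpha = 3$ diverges. On a real model the useful range will differ, so plotting the cost against the iteration count for each candidate is a safer guide than the final value alone.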

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/ef61e4e2-5ebb-4855-80ce-e0be18a01b29/various_learning_rates.png