AI For Trading: Linear Regression (19)

Linear Regression

Now, let's look at how to use one random variable to predict another.
We'll cover the basics of regression, since it forms the basis for several models that are used to analyze stock returns over time. For example, if we want to estimate the price of a house, we may assume that, all other things being equal, home buyers are willing to pay more for a bigger house.
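
As a minimal sketch of the house-price idea, the following fits a straight line by ordinary least squares using only the standard library. The sizes and prices are made-up numbers chosen for illustration, and `fit_line` is a hypothetical helper name, not something from the course.

```python
# Made-up illustration data: house sizes (square meters) and prices (thousands).
sizes = [50.0, 70.0, 90.0, 110.0, 130.0]
prices = [150.0, 200.0, 260.0, 300.0, 360.0]


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(sizes, prices)

# Residuals are the gaps between the observed prices and the fitted line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(sizes, prices)]

print(f"fitted line: price = {intercept:.2f} + {slope:.2f} * size")
print("residuals:", [round(r, 2) for r in residuals])
```

A positive slope here corresponds to the assumption that buyers pay more for a bigger house. The residuals are what a heteroscedasticity test would examine.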

Breusch-Pagan Test for Heteroscedasticity (revisited)

Now that we’ve covered regression, let’s look at the Breusch-Pagan test in more depth. The Breusch-Pagan test is one of many tests for homoscedasticity/heteroscedasticity.

It takes the residuals from a regression and checks whether their size (and therefore their variance) depends on the independent variables that we fed into the regression. (Note that we’ll explain residuals in a few videos within this lesson, so feel free to jump back here after you watch the video “Linear Regression”.)

The test does this by performing a second regression of the squared residuals against the independent variables, and checking whether the coefficients from that second regression are statistically significant (non-zero).

If the coefficients of this second regression are significant, then the size of the residuals depends on the independent variables.

If the size of the residuals depends on the independent variables, then the variance of the errors is not constant across the data. In other words, the data is likely heteroscedastic. The null hypothesis of the Breusch-Pagan test is homoscedasticity, so if the p-value of the test is ≤ 0.05, we reject that hypothesis at the 5% significance level and conclude that the data is likely heteroscedastic (not homoscedastic).
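
The two-regression procedure described above can be sketched by hand for the simplest case of a single independent variable. This is an illustrative sketch, not a reference implementation: it uses the studentized (Koenker) form of the statistic, LM = n × R² from the second regression, and compares it to a chi-square distribution with one degree of freedom. The data is simulated, and `fit_line`, `r_squared`, and `breusch_pagan_one_regressor` are hypothetical helper names.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """R-squared of the least squares line of y on x."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_one_regressor(x, y):
    """Return (LM statistic, p-value) for one independent variable."""
    # First regression: y on x, then square the residuals.
    intercept, slope = fit_line(x, y)
    squared_resid = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]
    # Second regression: squared residuals on x. LM = n * R^2.
    lm = max(len(x) * r_squared(x, squared_resid), 0.0)
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


# Simulated data (assumption): noise grows with x in one series, not the other.
random.seed(0)
x = [random.uniform(1.0, 10.0) for _ in range(500)]
y_hetero = [2.0 + 3.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]
y_homo = [2.0 + 3.0 * xi + random.gauss(0.0, 2.0) for xi in x]

lm_hetero, p_hetero = breusch_pagan_one_regressor(x, y_hetero)
lm_homo, p_homo = breusch_pagan_one_regressor(x, y_homo)
print(f"noise grows with x: LM = {lm_hetero:.2f}, p-value = {p_hetero:.3g}")
print(f"constant noise:     LM = {lm_homo:.2f}, p-value = {p_homo:.3g}")
```

With more than one independent variable, the second regression uses all of them and the degrees of freedom equal the number of variables, so the one-degree-of-freedom shortcut used here no longer applies.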

Breusch-Pagan Test in Python

In Python, we can use the statsmodels.stats.diagnostic.het_breuschpagan(resid, exog_het) function to test for heteroscedasticity. For resid, we pass the residuals from the regression of the dependent variable against the independent variables. For exog_het, we pass the independent variables that may affect the variance of the data; this array should include a constant column. The function returns four values: the Lagrange multiplier statistic, its p-value, an F-statistic, and the p-value of that F-statistic.

Quiz

[Figure: regression line]

Is the coefficient in the regression line above positive, negative, or zero?

A: Positive
B: Negative
C: Zero

Answer: B

Those who act often succeed; those who keep walking often arrive.