Linear Regression

Now you're ready to see if hot weather really correlates with crime rates! 🔥👮

Visualisation

Correlation

But checking by eye is unprofessional! Let's compute the correlation between the variables!

Simple Linear Regression

Now that we've seen there could be a correlation between temperature and crime count, let's check with a simple linear regression!

Multiple Linear Regression

Simple linear regression alone won't be enough. Let's bring in some more independent variables!

🏴‍☠️: To create a smooth visualization of noisy data, apply a rolling mean to both the crime count and temperature before plotting. Use the zoo package's rollmean() function to calculate a 30-day rolling mean. Then use ggplot() to create your plot and geom_line() to draw both the crime count and temperature series. Because the two series are on different scales, multiply the temperature values by a scaling factor so they line up on the same plot. For the secondary y-axis (temperature), use sec_axis() to display the temperature in Celsius. Adjust the y-axis limits with scale_y_continuous() and apply colors using scale_color_manual() to make the lines distinguishable.

To explore the relationship between crime count and temperature, consider using the lm() function to perform a linear regression.

🐍: To create a smoother visualization of noisy data, apply a rolling mean to both the crime count and temperature using the .rolling() and .mean() methods before plotting.
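As a rough illustration of the rolling mean, here is a minimal sketch on synthetic data. The DataFrame, the column names `temperature` and `crime_count`, and the 30-day window (borrowed from the R hint) are assumptions for the example, not the course dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data: names and values are made up for illustration.
rng = np.random.default_rng(42)
days = np.arange(365)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
temperature = 10 + 12 * np.sin(2 * np.pi * (days - 100) / 365) + rng.normal(0, 3, 365)
crime_count = 200 + 4 * temperature + rng.normal(0, 20, 365)
df = pd.DataFrame(
    {"temperature": temperature, "crime_count": crime_count}, index=dates
)

# 30-day rolling mean of each series.
# The first 29 rows are NaN because the window is not yet full.
window = 30
df["temperature_smooth"] = df["temperature"].rolling(window=window).mean()
df["crime_count_smooth"] = df["crime_count"].rolling(window=window).mean()

print(df[["temperature_smooth", "crime_count_smooth"]].dropna().head())
```

A larger window gives a smoother line but hides short-term changes, so it may be worth trying a few values.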

You'll need a graph with a dual Y-axis. Use plt.subplots() to create a figure with user-defined dimensions.

Color and labels help visually distinguish the data lines, making the plot easier to interpret. 😄

Use twinx() to add a second Y-axis, which lets you display two different data scales on the same graph. Then use the .corr() method to determine the correlation between temperature and crime rate.
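The plotting and correlation hints can be combined roughly as follows. This is a sketch on synthetic data: the column names, figure size, and colors are assumptions, and your own DataFrame will look different.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical daily data: names and values are made up for illustration.
rng = np.random.default_rng(0)
days = np.arange(365)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
temperature = 10 + 12 * np.sin(2 * np.pi * (days - 100) / 365) + rng.normal(0, 3, 365)
crime_count = 200 + 4 * temperature + rng.normal(0, 20, 365)
df = pd.DataFrame(
    {"temperature": temperature, "crime_count": crime_count}, index=dates
)
smooth = df.rolling(window=30).mean()  # smooth both columns at once

# Left Y-axis: crime count.
fig, ax_crime = plt.subplots(figsize=(12, 5))
ax_crime.plot(smooth.index, smooth["crime_count"], color="tab:blue")
ax_crime.set_xlabel("Date")
ax_crime.set_ylabel("Crime count (30-day mean)", color="tab:blue")

# Right Y-axis: temperature, sharing the same X-axis.
ax_temp = ax_crime.twinx()
ax_temp.plot(smooth.index, smooth["temperature"], color="tab:red")
ax_temp.set_ylabel("Temperature in °C (30-day mean)", color="tab:red")

fig.tight_layout()
# Call plt.show() or fig.savefig("plot.png") to view the result.

# Pearson correlation (the pandas default) on the unsmoothed series.
correlation = df["temperature"].corr(df["crime_count"])
print(f"Correlation between temperature and crime count: {correlation:.2f}")
```

The correlation here is computed on the raw series rather than the smoothed ones, since smoothing tends to inflate it. Which one to report is a judgment call.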

To explore the relationship between crime count and temperature, and then between crime count and several variables at once, consider using OLS from the statsmodels library to perform both simple and multiple linear regressions.
