Data Science - Wintersemester 24/25

Crime Rate Over Time


Last updated 3 months ago

Let's start with some simple exploratory plots and look at the overall trend in crime cases. Can we find interesting trends or relationships in the crime cases?

The result should look something like this:

[Figures: line plot and bar plot]

What does your plot look like? A plot doesn't have to be aesthetically pleasing, but it has to convey your message without any further explanation. Another person should be able to grasp the essence of what you're trying to show just by seeing your plot. So let's add some more information.

After this step, you should expect the plot to look something like this:

๐Ÿดโ€โ˜ ๏ธ: In order to get the number of crimes per month, use group_by() to group your data by month before summing over each day.

The plots above were made with Python. If you're using R and your plots don't look identical, don't worry!
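For Python users, the monthly grouping described above could be sketched roughly as follows. This is only a minimal illustration on made-up toy data: the column names `date` and `cases` are assumptions and will likely differ from the columns in your crime dataset.

```python
import pandas as pd

# Hypothetical toy data: one row per day with a daily case count.
# The column names "date" and "cases" are assumptions for illustration.
daily = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=60, freq="D"),
    "cases": [1] * 60,
})

# Group the rows by calendar month, then sum the daily counts per month.
monthly = daily.groupby(daily["date"].dt.to_period("M"))["cases"].sum()

print(monthly)
```

Grouping on `dt.to_period("M")` keeps the year and month together, so January 2024 and January 2025 would not be merged into one group.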

Does this plot help you explain what you wanted to explain? The plot is very "noisy", which makes it hard to find a general trend. Let's smooth the data to make trends more visible. Here we will plot a rolling mean (a.k.a. moving average).

The zoo package has a function to calculate the rolling mean. You can read more about the rolling mean and about the zoo package in its documentation.

🐍: To analyze time-series data, resample() is a powerful tool that adjusts the frequency of your data. Use it with datetime columns to group data into specific intervals (e.g., daily, monthly) and apply aggregation functions like sum(), mean(), or size(). Unlike groupby(), which requires explicit grouping columns, resample() works directly with time-based data, making it ideal for tasks like counting events per day or calculating averages over weeks. Combine it with methods like rolling() for smoothing trends and unstack() for pivoting multi-level indexes to prepare your data for visualization. Check the pandas documentation for more examples!
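The combination of resample(), rolling() and unstack() might look something like the sketch below. It uses a tiny invented dataset, and the column names `date` and `crime_type` are assumptions, not the names in the project data. For the per-type counts, the sketch uses `pd.Grouper` inside groupby() as one possible way to get a multi-level index to unstack; other approaches work too.

```python
import pandas as pd

# Hypothetical toy data: one row per reported incident.
# Column names "date" and "crime_type" are assumptions for illustration.
incidents = pd.DataFrame({
    "date": pd.to_datetime([
        "2024-01-01", "2024-01-01", "2024-01-02",
        "2024-01-04", "2024-01-04", "2024-01-04",
    ]),
    "crime_type": ["theft", "assault", "theft", "theft", "theft", "assault"],
})

# Count incidents per day; days without any incident get a count of 0.
daily_counts = incidents.set_index("date").resample("D").size()

# Smooth with a 3-day rolling mean; the first two values are NaN
# because a full window is not yet available.
smoothed = daily_counts.rolling(window=3).mean()

# Count incidents per day and crime type, then pivot the crime types
# into columns so each type can be plotted as its own line.
by_type = (
    incidents
    .groupby([pd.Grouper(key="date", freq="D"), "crime_type"])
    .size()
    .unstack(fill_value=0)
)

print(daily_counts)
print(smoothed)
print(by_type)
```

On real data a longer window (for example 7 or 30 days) usually gives a smoother curve; the window of 3 here is only chosen to fit the toy example.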
