Data Science - Wintersemester 24/25

Crime Types


Last updated 5 months ago

Let's dive deeper and look into the different crime types. What do you think are the most common crime types? To answer this question, we want to plot the ten most common crime types in descending order.

We grouped the crime types into broader categories to get a better overview, but you don't have to do this. The result looks like this:

For these ten most common crime types, let's look at the gender distribution.

Your plots don't have to look exactly like this. In fact, we encourage you to try out your own ideas. Most of the time you will find a solution to your problem by using a search engine of your choice or websites like stackoverflow.com.

R: geom_col() or geom_bar() creates a stacked bar plot by default. Sometimes you may want the bars side by side to make them easier to compare. You can do this with the position argument, for example position = "dodge".

Python: When analyzing the frequency of categories, use nunique() to count the distinct types in a column, or value_counts() to rank them by occurrence. To visualize the top categories, use plot(kind='barh') for a horizontal bar chart, which works well for long category names. For grouped data, groupby() combined with size() calculates the category counts for each group, and unstack() pivots the result into one column per group, which is easier to plot. When customizing plots, specify colors explicitly with a dictionary (e.g., for gender categories), and make sure that less common categories also get their own distinct color. Experiment with grid() and title() to improve readability and aesthetics!
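The Python hints above could be combined roughly as in the following sketch. It is only one possible approach, not the reference solution: the tiny DataFrame, the column names crime_type and victim_sex, the gender codes, the colors and the helper plot_by_gender are all made up for illustration, so replace them with whatever your cleaned crime dataset actually contains.

```python
import pandas as pd

# Toy stand-in for the crime dataset. Column names and values are
# hypothetical; use the columns of your own cleaned data instead.
crimes = pd.DataFrame({
    "crime_type": ["Theft", "Theft", "Theft", "Theft", "Assault",
                   "Assault", "Assault", "Burglary", "Burglary", "Fraud"],
    "victim_sex": ["F", "M", "F", "M", "M", "M", "F", "F", "X", "M"],
})

TOP_N = 3  # set this to 10 for the real data

# Number of distinct crime types in the column.
n_types = crimes["crime_type"].nunique()

# value_counts() ranks the types by frequency, most common first.
top_types = crimes["crime_type"].value_counts().head(TOP_N)

# Count rows per (crime type, gender) pair, then pivot gender into columns.
top_rows = crimes[crimes["crime_type"].isin(top_types.index)]
by_gender = (
    top_rows.groupby(["crime_type", "victim_sex"])
    .size()
    .unstack(fill_value=0)   # pairs that never occur become 0
    .loc[top_types.index]    # keep the most-common-first ordering
)


def plot_by_gender(table):
    """Draw a grouped horizontal bar chart with one bar per gender."""
    import matplotlib.pyplot as plt  # imported here so the tables work without it

    # Explicit colors per gender code (hypothetical codes and colors).
    colors = {"F": "tab:orange", "M": "tab:blue", "X": "tab:gray"}
    # barh draws the first row at the bottom, so reverse the row order
    # to get the most common crime type at the top.
    ax = table.iloc[::-1].plot(kind="barh", color=colors)
    ax.set_title("Most common crime types by gender")
    ax.grid(axis="x")
    plt.tight_layout()
    plt.show()
```

Calling plot_by_gender(by_gender) draws the bars side by side, and top_types.plot(kind='barh') gives the simpler chart of the most common crime types on their own.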
