Columns
Your task here is to drop unnecessary columns and rename the remaining ones for better consistency.
Why Use Consistent Naming Conventions?
In data science projects, consistent naming conventions are crucial for maintaining readable, maintainable code. When working with datasets, especially in collaborative environments, standardized column names help team members quickly understand your data structure without having to decipher inconsistent naming patterns.
What is SCREAMING_SNAKE_CASE?
SCREAMING_SNAKE_CASE is a naming convention where:
All letters are UPPERCASE
Words are separated by UNDERSCORES (_)
No spaces or special characters are used
For example:
"first name" becomes "FIRST_NAME"
"Annual Revenue ($)" becomes "ANNUAL_REVENUE"
"customerID" becomes "CUSTOMER_ID"
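The conversions above can be sketched in plain Python. This is only one possible approach, not a prescribed solution: the helper name to_screaming_snake is hypothetical, and the rule of splitting camelCase at a lowercase-to-uppercase boundary is an assumption inferred from the "customerID" example.

```python
import re

def to_screaming_snake(name: str) -> str:
    """Convert a column label to SCREAMING_SNAKE_CASE (hypothetical helper)."""
    # Insert an underscore where a lowercase letter or digit is followed
    # by an uppercase letter: "customerID" -> "customer_ID"
    with_boundaries = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Collapse every run of non-alphanumeric characters into one underscore
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", with_boundaries)
    # Drop leading/trailing underscores and uppercase the result
    return cleaned.strip("_").upper()

for label in ["first name", "Annual Revenue ($)", "customerID"]:
    print(label, "->", to_screaming_snake(label))
```

Labels that are already uppercase with underscores pass through unchanged, so the helper can be applied to a whole list of column names safely.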
Why SCREAMING_SNAKE_CASE?
Industry Standard: This convention is commonly used in database systems and data warehouses, making your skills transferable.
Readability: Uppercase column names stand out in your code, so they are easy to distinguish from variables and functions.
Consistency: Enforces a uniform style across your dataset, eliminating confusion from mixed conventions.
Clarity in Code: When reading pandas operations like df['CUSTOMER_AGE'], it's immediately clear you're referring to a column name rather than a calculated variable.
To remove a list of columns, you can use .drop(). You can get the dataframe's column labels from the .columns property, which you can reassign with another list. To rename columns, you can use .rename() with a dictionary of old and new column name mappings (e.g. {"old_name": "NEW_NAME"}), or just directly reassign the .columns property.
The .columns property is a pandas.Index object. You can do string operations on it with .str; to convert to uppercase you can use .str.upper().
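A minimal sketch of these operations on a small made-up dataframe. The data and the column names ("first name", "customer age", "notes") are assumptions for illustration only and are not taken from the exercise dataset.

```python
import pandas as pd

# Hypothetical data; these column names are invented for the example
df = pd.DataFrame({
    "first name": ["Ada", "Grace"],
    "customer age": [36, 45],
    "notes": ["", ""],
})

# Drop columns that are not needed
df = df.drop(columns=["notes"])

# Rename a single column with an old-name -> new-name mapping
df = df.rename(columns={"first name": "FIRST_NAME"})

# Clean the remaining labels in one pass via the .str accessor on the Index,
# then reassign the result to .columns
df.columns = df.columns.str.replace(" ", "_", regex=False).str.upper()

print(list(df.columns))
```

Both .drop() and .rename() return a new dataframe by default, which is why the results are assigned back to df.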
