Data Analysis Fundamentals

Creating New Columns

The companies dataset has Revenue and Expenses columns, but no Profit column. Most analysis questions require values that aren’t in the original file, so they have to be computed from the columns that are.

Multiply two existing columns to create a new one. Pandas evaluates the expression element-wise, producing one result per row.

The pattern is always df['NewCol'] = expression. Pandas applies it to every row without needing a loop.
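
As a minimal sketch of this pattern, the example below builds a small DataFrame and derives two columns from it. The data and the Units and UnitPrice column names are made up for illustration; they are not from the companies dataset.

```python
import pandas as pd

# Hypothetical sample data (illustrative values only).
df = pd.DataFrame({
    "Company": ["A", "B", "C"],
    "Units": [10, 25, 8],
    "UnitPrice": [10, 10, 10],
    "Expenses": [60, 200, 90],
})

# Multiply two existing columns; pandas computes one value per row.
df["Revenue"] = df["Units"] * df["UnitPrice"]

# A derived column can feed the next one: Profit = Revenue - Expenses.
df["Profit"] = df["Revenue"] - df["Expenses"]

print(df[["Company", "Revenue", "Profit"]])
```

With this data, Revenue comes out as 100, 250, 80 and Profit as 40, 50, -10. No loop is written anywhere; the arithmetic is applied to whole columns at once.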


Comparisons create True/False columns, which is useful for flagging rows that meet a condition.

A boolean column marks each row as passing or failing a test. Later, it can be used to filter the DataFrame.
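
A short sketch of flagging and then filtering might look like the following. The threshold of 90 and the HighRevenue column name are arbitrary choices for this example, and the data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data (illustrative values only).
df = pd.DataFrame({
    "Company": ["A", "B", "C"],
    "Revenue": [100, 250, 80],
})

# The comparison returns a boolean Series with one True/False per row.
df["HighRevenue"] = df["Revenue"] > 90

# A boolean column can be used directly as a row filter.
high = df[df["HighRevenue"]]

print(df)
print(high)
```

Here HighRevenue is True, True, False, so the filtered DataFrame keeps only companies A and B.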


Chain operations to compute percentages, such as a profit margin from Revenue and Cost.

Derived columns like profit margins, growth rates, and ratios turn raw numbers into metrics that actually answer questions.
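
One way to sketch a chained calculation is shown below. It assumes the common definition of profit margin, (Revenue - Cost) / Revenue * 100, which may differ from the formula used in the original lesson. The MarginPct column name and the data are hypothetical.

```python
import pandas as pd

# Hypothetical sample data (illustrative values only).
df = pd.DataFrame({
    "Company": ["A", "B", "C"],
    "Revenue": [100, 200, 80],
    "Cost": [50, 150, 100],
})

# Subtract, divide, and scale in a single expression.
# Parentheses make the order of operations explicit.
df["MarginPct"] = (df["Revenue"] - df["Cost"]) / df["Revenue"] * 100

print(df)
```

With these values the margins are 50.0, 25.0, and -25.0 percent. The negative margin for company C shows that its cost exceeds its revenue.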


What will be the output?

Python

Existing columns can be overwritten too: df['Price'] = df['Price'] * 1.1 increases every price by 10%.
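
A small sketch of overwriting a column in place follows. The product data is hypothetical, and the rounding step is an addition for this example to keep currency-like values tidy after floating-point multiplication.

```python
import pandas as pd

# Hypothetical sample data (illustrative values only).
df = pd.DataFrame({
    "Product": ["Pen", "Notebook", "Bag"],
    "Price": [10.0, 20.0, 50.0],
})

# Assigning to an existing column name replaces its values.
df["Price"] = df["Price"] * 1.1

# Optional: round to two decimals to avoid floating-point noise.
df["Price"] = df["Price"].round(2)

print(df)
```

Overwriting discards the original values, so when the old prices are still needed it is safer to write the result to a new column instead.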

