[Guide] Anomaly detection algorithm

Image credit

https://unsplash.com/photos/klMii3cR9iI

Topic: Data, Code
Type: Guide

Anomaly detection is a data-driven technique that businesses across industries use to spot unusual behavior in a product's performance. It applies in many situations, from fraud detection to performance monitoring and budget optimization.

For this demonstration, we'll use a simple dataset and the Prophet algorithm. Prophet is a time-series forecasting library rather than a dedicated anomaly detection algorithm, but it can serve this purpose too: observations that fall outside its forecast interval can be treated as anomalies. We'll work with a dataset of prices for various fuel types. You can download it from this website (from the 1st table).

We'll be using a Colab notebook, so there's no need to install any additional tools locally to run the provided code.

# Install the necessary libraries
!pip install pandas
!pip install matplotlib
!pip install prophet

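# Load necessary libraries
from prophet import Prophet

# Initialize the model; interval_width sets the width of the uncertainty interval (95% here)
model = Prophet(interval_width=0.95)

# Fit the model (data is the DataFrame with 'ds' and 'y' columns built in the dataset-loading cell)
model.fit(data)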

# Forecast on the original data to get the bounds
forecast = model.predict(data)

# Flag the observations that fall outside the forecast's upper and lower bounds
anomalies = data.loc[(data['y'] > forecast['yhat_upper']) | (data['y'] < forecast['yhat_lower'])]
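
The flagging rule above boils down to a per-row comparison against two bounds. Here is a minimal, library-free sketch of that rule, assuming three equal-length sequences. The function name `flag_outside_interval` and the sample numbers are made up for illustration; they are not part of Prophet or of the fuel dataset.

```python
def flag_outside_interval(actual, lower, upper):
    """Return the positions where actual falls outside [lower, upper]."""
    if not (len(actual) == len(lower) == len(upper)):
        raise ValueError("all three sequences must have the same length")
    return [
        i
        for i, (y, lo, hi) in enumerate(zip(actual, lower, upper))
        if y < lo or y > hi  # strict comparison: values on a bound are not flagged
    ]


# Illustrative values only (not taken from the fuel dataset)
actual = [1.50, 1.52, 1.90, 1.51, 1.10]
lower = [1.40, 1.42, 1.43, 1.41, 1.40]
upper = [1.60, 1.62, 1.63, 1.61, 1.60]

flagged = flag_outside_interval(actual, lower, upper)
print(flagged)
```

The pandas one-liner does the same thing, with `forecast['yhat_lower']` and `forecast['yhat_upper']` playing the role of the two bounds.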

# Visualize the results
import matplotlib.pyplot as plt
import pandas as pd

# Plot the Prophet forecast
fig1 = model.plot(forecast)

# Overlay the anomalies (dates converted to datetime so they line up with the forecast's date axis)
plt.scatter(pd.to_datetime(anomalies['ds']), anomalies['y'], color='red', s=50, label='Anomalies')
plt.legend()
plt.show()

# The red dots mark the dates that were flagged as anomalies.

# Print the data points that were flagged as anomalies
print(anomalies[['ds', 'y']])
              ds      y
1806  2022-02-22  1.621
1807  2022-02-23  1.623
1808  2022-02-24  1.626
1809  2022-02-25  1.635
1810  2022-02-26  1.640
...          ...    ...
1932  2022-06-28  2.123
1933  2022-06-29  2.120
1934  2022-06-30  2.112
1935  2022-07-01  2.102
1936  2022-07-02  2.094
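
The flagged rows above are consecutive days, so it can be easier to read them as date ranges than as individual points. A small sketch of one way to do that, using only the standard library. The helper name `group_consecutive_days` is hypothetical, and the last date in the sample list is invented to show how an isolated anomaly is reported.

```python
from datetime import date, timedelta


def group_consecutive_days(days):
    """Collapse a collection of dates into (start, end) runs of consecutive days."""
    runs = []
    for day in sorted(set(days)):
        if runs and day - runs[-1][1] == timedelta(days=1):
            runs[-1] = (runs[-1][0], day)  # extend the current run
        else:
            runs.append((day, day))  # start a new run
    return runs


# First five dates come from the printed output; the last one is made up
flagged_days = [
    date(2022, 2, 22), date(2022, 2, 23), date(2022, 2, 24),
    date(2022, 2, 25), date(2022, 2, 26),
    date(2022, 3, 15),  # hypothetical, for illustration only
]

runs = group_consecutive_days(flagged_days)
for start, end in runs:
    print(start, "to", end, f"({(end - start).days + 1} days)")
```

With the real results, the dates could be taken from the `ds` column of `anomalies` after converting them to `date` objects.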

Run the code on your own, adjust interval_width, and compare the results: a narrower interval flags more points, while a wider one flags fewer. Also, try to expand the capabilities of Prophet or try other algorithms (like Luminaire) to get a better understanding of how anomaly detection works.
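
To get a feel for how the interval width changes the number of flagged points without rerunning Prophet, here is a rough stand-in. It assumes a simple symmetric band around the mean of synthetic data, which is not how Prophet computes its intervals. The function `count_flags` and the generated "prices" are invented for this sketch.

```python
import random
from statistics import NormalDist, mean, stdev


def count_flags(values, interval_width):
    """Count values outside a symmetric band of the given coverage around the mean."""
    z = NormalDist().inv_cdf((1 + interval_width) / 2)  # roughly 1.96 for 0.95
    center, spread = mean(values), stdev(values)
    lower, upper = center - z * spread, center + z * spread
    return sum(1 for v in values if v < lower or v > upper)


random.seed(0)  # fixed seed so the run is repeatable
values = [random.gauss(1.5, 0.05) for _ in range(500)]  # synthetic data, not real prices

for width in (0.80, 0.95, 0.99):
    print(width, count_flags(values, width))
```

The counts shrink as the width grows, because each wider band contains the narrower ones. The same trade-off applies when changing `interval_width` in the Prophet model.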

# Load the dataset into a pandas DataFrame
# (run this cell before fitting the model; it creates the data DataFrame used there)
import pandas as pd

# Save the CSV and upload it to the Colab notebook, then copy its path and paste it here
df = pd.read_csv('/content/your_saved_file.csv')

# Select and rename the relevant columns for Prophet, which expects 'ds' (date) and 'y' (value)
data = df[['index', 'diesel']].rename(columns={'index': 'ds', 'diesel': 'y'})

# Display the first few rows of the transformed dataset
data.head()
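
Before fitting, it can help to parse the dates, drop rows with missing prices, and sort by date, so that the rows of `data` line up with the rows of the forecast when the two are compared. A minimal sketch of that preparation, assuming the `index` and `diesel` column names used above. The helper `prepare_for_prophet`, the `petrol` column, and all values in the inline CSV are made up for illustration.

```python
import io

import pandas as pd

# Tiny made-up CSV standing in for the downloaded file (values are illustrative only)
SAMPLE_CSV = """index,diesel,petrol
2022-01-03,1.50,1.60
2022-01-01,1.48,1.58
2022-01-02,,1.59
"""


def prepare_for_prophet(df, date_col="index", value_col="diesel"):
    """Return a tidy two-column frame with 'ds' as datetime and 'y' as the value."""
    out = df[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])  # parse date strings
    out = out.dropna(subset=["y"])  # drop rows without a price
    return out.sort_values("ds").reset_index(drop=True)  # chronological order, clean index


data = prepare_for_prophet(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(data)
```

With the real file, `pd.read_csv('/content/your_saved_file.csv')` would replace the inline sample.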