This content originally appeared on DEV Community and was authored by vinicius fagundes
In Part 1 and Part 2, we created a pipeline to crawl smartphone prices and load them into Snowflake. Now, weβll show how your Data Science team can use that data to:
- Recommend products by price
- Suggest the best day to buy
- Predict future price drops
Letβs walk through a prototype using pandas
, scikit-learn
, and matplotlib
.
Data Snapshot
Weβll use a dataframe like the one below, pulled from the Snowflake PRICING.ECOM_PRICE_INDEX
table:
import pandas as pd
df = pd.read_sql("SELECT * FROM PRICING.ECOM_PRICE_INDEX", connection)
df.head()
product_name | product_price | scraped_at | ingestion_date |
---|---|---|---|
Galaxy S23 Ultra | 1199.99 | 2024-06-01 08:00:00 | 2024-06-01 |
Galaxy S23 Ultra | 1149.99 | 2024-06-05 08:00:00 | 2024-06-05 |
iPhone 14 Pro Max | 999.99 | 2024-06-01 08:00:00 | 2024-06-01 |
iPhone 14 Pro Max | 979.99 | 2024-06-10 08:00:00 | 2024-06-10 |
1. Recommend Lowest Prices by Product
recommendations = df.sort_values('scraped_at') .groupby('product_name') .agg(best_price=('product_price', 'min'),
last_price=('product_price', 'last'),
price_diff=('product_price', lambda x: x.iloc[-1] - x.min()))
2. Find the Best Day to Buy
df['weekday'] = pd.to_datetime(df['scraped_at']).dt.day_name()
best_days = df.groupby(['product_name', 'weekday'])['product_price'].mean().reset_index()
best_day_to_buy = best_days.loc[best_days.groupby('product_name')['product_price'].idxmin()]
3. Predict Future Price Drops
from sklearn.linear_model import LinearRegression
import numpy as np
df['day'] = (pd.to_datetime(df['scraped_at']) - pd.to_datetime(df['scraped_at']).min()).dt.days
predictions = []
for product in df['product_name'].unique():
subset = df[df['product_name'] == product].sort_values('day')
X = subset[['day']]
y = subset['product_price']
if len(X) < 3:
continue
model = LinearRegression().fit(X, y)
next_day = np.array([[X['day'].max() + 1]])
predicted_price = model.predict(next_day)[0]
predictions.append({
'product_name': product,
'predicted_price_next_day': predicted_price,
'trend': 'dropping' if predicted_price < y.iloc[-1] else 'rising'
})
predicted_df = pd.DataFrame(predictions)
Example Output
product_name | predicted_price_next_day | trend |
---|---|---|
Galaxy S23 Ultra | 1129.45 | dropping |
iPhone 14 Pro Max | 982.10 | rising |
Final Note
Itβs tempting to jump into advanced AI models β but for many pricing problems, basic data analysis is more than enough.
Simple statistics like averages and min/max reveal buyer trends
Time-based aggregation gives powerful forecasting signals
ML can add predictive power, but isnβt always necessary
In many cases, using clean pipelines and descriptive analytics leads to faster insights and more trustworthy decisions than a complex AI model.
Start with the data. Understand it. Then decide whether AI is worth the extra complexity.
This content originally appeared on DEV Community and was authored by vinicius fagundes