The question is fairly simple. How do I return the minimum purchase date for each customer using Tidier?
using Tidier, DataFrames, Plots, CSV
#params
f = "path"
df = CSV.File(f) |> DataFrame
df = @chain df begin
    @select(SHOPIFY_ORDER_ID, CUSTOMER_ID, SHIPMONTH, GROSS_REVENUE, Country)
    @rename(order_id = SHOPIFY_ORDER_ID,
            customer_id = CUSTOMER_ID,
            date = SHIPMONTH,
            revenue = GROSS_REVENUE,
            country = Country)
    @filter(country != "CA")
    @filter(!ismissing(date))
    @filter(revenue != 0.0)
end
# logic to calculate summary stats
df_sum = @chain df begin
    @group_by(customer_id)
    @mutate(
        cohort = min(date)
    )
end
min(df[!, :date])
For df_sum, I receive the following error:
ERROR: ArgumentError: argument is not a permutation
Stacktrace:
 [1] invperm(a::Vector{Int64})
   @ Base .\combinatorics.jl:282
 [2] groupby(df::DataFrame, cols::Cols{Tuple{Symbol}}; sort::Bool, skipmissing::Bool)
   @ DataFrames C:\path\.julia\packages\DataFrames\LteEl\src\groupeddataframe\groupeddataframe.jl:264
 [3] top-level scope
   @ path.jl:453
When attempting to find the minimum date in the DataFrame directly, I receive this error:
ERROR: MethodError: no method matching min(::Vector{Union{Missing, Dates.Date}})
Closest candidates are:
  min(::Any, ::Missing) @ Base missing.jl:134
  min(::Any, ::Any) @ Base operators.jl:481
  min(::Any, ::Any, ::Any, ::Any...) @ Base operators.jl:578
  ...
Stacktrace:
 [1] top-level scope
   @ c:\path\script.jl:28
This suggests to me that min doesn't work when the column's element type includes Missing, but I'm not sure how to proceed from there.
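For reference, here is a minimal sketch of how the second error can be reproduced outside the DataFrame, using only Base and the Dates standard library. The `dates` vector and the `earliest` variable are made-up stand-ins for the real column, not part of the script above. It suggests the MethodError may have less to do with Missing itself than with `min` expecting separate scalar arguments, whereas `minimum` reduces over a collection; this is one reading of the error, not a confirmed diagnosis of the grouped case.

```julia
using Dates

# Hypothetical stand-in for the :date column, with the same element type
dates = Union{Missing, Date}[Date(2023, 3, 1), missing, Date(2023, 1, 15)]

# min compares the scalar arguments it is given; a single Vector argument
# has no matching method, so this line would throw a MethodError:
# min(dates)

# minimum reduces over a collection, but propagates missing if one is present
with_missing = minimum(dates)

# skipmissing drops the missing entries before the reduction
earliest = minimum(skipmissing(dates))
```

If the same idea carries over to the grouped step, `minimum(skipmissing(date))` in place of `min(date)` would be the thing to try, though that has not been verified against Tidier here.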