SUM(col) OVER (ORDER BY date ROWS BETWEEN 4 PRECEDING and CURRENT ROW) is an easy way to calculate a rolling sum. Useful in problems like the last one you showed.
I think it could be also helpful if you can summarize leetcode python questions in basic algorithm which are good for data scientists to prepare for technical interviews. Some of the companies do ask basic algorithm questions during coding interviews.
Hi Emma! Good video overall. One small mistake in the Data Categorization section. '[10-15>' is tricky here since the above code won't return the answer as 0. Because we cannot case when something does not exist in the table. To do this, we might want to union all to consider 0 as the case.
thanks! looking forward to window functions vidoe!
Hi Emma! A very good video. But I think games_played_so_far will not show the running total. It will show total games played by some one on certain date. Reason: 1st Step: Join two tables 2nd Step: Extract rows using where clause (it would return rows from the derived/joined table and WILL EXECUTE ONLY ONCE in query for all the rows where a.event_date is either greater or equat to b.event_date) 3rd Step: Grouping data according to a.player_id, a.event_date 4th Step: Selecting a.player_id, a.event_date and SUM(b.games_played). Here summation gives the total number of games played by a player on certain date. So far as I know we do not need WHERE clause here. I think the condition should be changed in ON clause and it should be a.event_date >= b.event_date. I may be wrong. Please guide if I am.
Hello Emma, For the Categorization problem here is my solving method. 1) I create the table that I going to use fo the join DROP TABLE IF EXISTS bins_value CREATE TABLE bins_value ( bin_name varchar(20) ) INSERT INTO bins_value VALUES ('[0-5>'), ('[5-10>'), ('[10-15>'), ('[15-20>'), ('20 ans more') 2) The query to the problem WITH BINS as( SELECT CASE WHEN duration >= 0 and duration < 5 THEN '[0-5>' WHEN duration >= 5 and duration < 10 THEN '[5-10>' WHEN duration >= 10 and duration < 15 THEN '[10-15>' WHEN duration >=15 and duration < 20 THEN '[15-20>' ELSE '20 ans more' END as bin FROM SESSIONS ) SELECT a.bin_name, ISNULL(b.total,0) T_total FROM (SELECT bin, COUNT(*) total FROM BINS GROUP BY bin) as b right join bins_value as a on b.bin = a.bin_name
really a clear breakdown of the SQL problems, thanks!
Hello at 4:22 Customer_id 3 and Delivery_id have the same order_date and customer_pref_delivery_date. Why is Customer 3 not satisfying the criteria you outlined?
clear and concise, nice vid
Will be having a live SQL interview next week, the video helps a lot!
Wow, this is a great video!
One suggestion for the first order question: the "first_order" table technically doesn't only reflect customers' first orders. consider, for example, a customer whose first order is placed on 1/1/2022 for delivery on 1/2/2022, and then in their second order they place it also on 1/1/2022 but this one is in fact an immediate delivery (for 1/1/2022). they'll show up now in your first query, but their first order was not, in fact, an immediate order (their second order was).
Nice video! Thank you, Emma! One question would be: for the categorization example, will it be easier to use UNION to add the 0 record row (10-15 in this case, at 08:50)? I was not able to come up with/find an easy solution using JOIN here, could you point me to some possible answers? Thanks!
For Data Categorization, can you show me how to Left join to get the missing bin? I checked around and everyone uses a union to satisfy that condition.
Thank you Emma! Good stuff!
How long do we have to wait till your next video on window functions?
Not SQL-related but what do you think are the skills data scientists need to learn now aside from Python and SQL?
The only challenge that I have been facing with leetcode is the quality of test cases against which the solution is tested. Also, the solution sometimes doesn't go with what was asked in the first place!
too much hands movement
@emma_ding