Create Calculated Column in a DataFrame

Input

first_name last_name country grad_date wedding_date income
Larry Smith US 5/3/1985 7/3/1988 140000
Rachel Johnson US 1/31/1988 5/30/1987 180000
Jason Williams CA 2/1/1990 94500
David Jones DE 7/3/1983 130000
Marry Brown US 134000
Jacob Davis US 4/30/1978 4/30/1979 158000
Wayne Miller DE 10/5/1985 5/3/1982 95000
Logan Wilson AU 12/7/1975 73000

Questions

Create a calculated column that:

  1. displays the earliest date across multiple date columns.
  2. displays the first non-null value across multiple columns.
  3. displays grouped aggregate values (equivalent to sum/max/…() over (partition by … order by …) in SQL).
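
The three calculated columns above can be sketched as follows. This is a minimal illustration that assumes pandas, not necessarily the code behind the link. It uses five rows from the input; for Jason Williams the input shows only one date, and placing it under grad_date is an assumption. The new column names (earliest_date, first_non_null_date, country_income_sum, country_income_max, country_running_income) are made up for this example.

```python
import pandas as pd

# Five rows from the input table. Jason Williams has a single date in the
# source; putting it under grad_date is an assumption made for this sketch.
df = pd.DataFrame(
    {
        "first_name": ["Larry", "Rachel", "Jason", "Marry", "Wayne"],
        "last_name": ["Smith", "Johnson", "Williams", "Brown", "Miller"],
        "country": ["US", "US", "CA", "US", "DE"],
        "grad_date": ["5/3/1985", "1/31/1988", "2/1/1990", None, "10/5/1985"],
        "wedding_date": ["7/3/1988", "5/30/1987", None, None, "5/3/1982"],
        "income": [140000, 180000, 94500, 134000, 95000],
    }
)

# Parse the month/day/year strings; missing values become NaT.
date_cols = ["grad_date", "wedding_date"]
for col in date_cols:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y")

# 1. Earliest date across the date columns (row-wise minimum, NaT is skipped).
df["earliest_date"] = df[date_cols].min(axis=1)

# 2. First non-null value, checking grad_date first and then wedding_date
#    (similar to COALESCE(grad_date, wedding_date) in SQL).
df["first_non_null_date"] = df["grad_date"].combine_first(df["wedding_date"])

# 3. Grouped aggregates written back to every row of the group,
#    similar to sum(income) / max(income) over (partition by country).
df["country_income_sum"] = df.groupby("country")["income"].transform("sum")
df["country_income_max"] = df.groupby("country")["income"].transform("max")

#    Running total within each country ordered by income, similar to
#    sum(income) over (partition by country order by income).
#    The result keeps the original index, so it aligns on assignment.
df["country_running_income"] = (
    df.sort_values("income").groupby("country")["income"].cumsum()
)

print(df)
```

Here transform is used instead of agg because it returns one value per original row, which is what a window function does in SQL. Rows with no dates at all (Marry Brown in the sample) end up with NaT in both date-derived columns.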

Code with GitHub Link

GitHub Link
