If you work with data much, you don't need a statistical model to predict that the odds of consistently getting data in the format you need for analysis are pretty low. Those who do a great deal of data cleaning and reformatting often turn to scripting languages like Python or specialty tools such as OpenRefine or R.
But it turns out that there's a lot of data munging you can do in a plain old Excel spreadsheet -- if you know how to craft the proper formulas.
In a presentation at the recent 2014 Computer Assisted Reporting (CAR) conference, MaryJo Webster, senior data reporter with Digital First Media -- a newspaper group in New York -- shared some of her favorite Excel tricks. The goal of these tips, Webster said: Learn at least one new thing that will make you say, "Why didn't I know this before?"
Tip 1: Split dates into separate fields
You can extract the year, month and day into separate fields from a date field in Excel by using formulas =Year(CellWithDate), =MONTH(CellWithDate) and =DAY(CellWithDate). Splitting dates this way -- by year, month and day of month -- works in Microsoft Access as well, Webster said.
In addition, you can also get the day of the week for any date in Excel with =WEEKDAY(CellWithDate). The default returns numbers, not names of the days of week, with 1 for Sunday, 2 for Monday and so on.
To display the name of the weekday instead of a number, apply a custom format to the cells with the weekday numbers, using Format cells > Custom; then type ddd in the Type text box to get three-day abbreviations or dddd for the full day name.
Tip 2: Find someone's current age
If you have someone's date of birth, you can find his or her current age on whatever day you open the spreadsheet with the =DATEDIF() and =TODAY() functions. TODAY(), as you might guess, gives the current date. DATEDIF() gives the difference between two dates in units of years ("y"), months ("m") or days ("d"), using the syntax:
=DATEDIF(Date1, Date2, Unit of measure)
So, to get current age in years, use the formula:
Note that the years unit returns ages in whole numbers and does not round up.
See an example below.