Home » Categories » CHARTrunner

CR: Data grouping tardiness

Some CHARTrunner customers express the need to count or group data that has been accumulated during certain periods of time. Let’s pretend today is September 25, 2009 and consider the following tardiness data.

Employee Tardiness Incidents

Date

EmpID

MinsLate

6/3/2009

4

15

6/25/2009

2

20

6/26/2009

1

10

6/27/2009

2

30

6/27/2009

3

10

6/29/2009

4

5

6/30/2009

3

20

7/1/2009

1

20

7/2/2009

6

30

7/3/2009

6

10

7/10/2009

2

10

7/15/2009

3

5

7/30/2009

2

5

8/1/2009

2

10

8/10/2009

2

3

8/20/2009

5

50

8/27/2009

3

50

9/3/2009

1

30

9/5/2009

6

10

9/6/2009

6

20

9/10/2009

5

10

9/15/2009

6

21

Imagine that you want to view the number of tardiness incidents within the last 30 days, the 30 days before that, the 30 days before that, etc. To do this, we must transform the data above so the results look something like this:

TimeFrame

NumOfTardyEmp

Last 30 days

6

Last 30-60 days

4

Last 60-90 days

9

Last 90-120 days

3

This is useful information, but how do we do this in CHARTrunner and avoid massaging our raw data each time we want a new chart?

In CHARTrunner, in the chart definition form, on the data definition tab, you have these choices for where to get data:

Table
Query or view
Stored procedure
Custom query

When you select Custom query you can write an SQL query. Writing an SQL query is intimidating for some users but I suggest that you give it a try. It provides a lot of power and is not as difficult as you think. To solve the problem described above, we write a custom query.

A query to pull up all the information as shown above looks like this:

SELECT *
FROM [Tardiness Records$ ]
<- - - This is the worksheet in the selected excel file

Date

EmpID

MinsLate

6/3/2009

4

15

6/25/2009

2

20

6/26/2009

1

10

Etc….

This gives us the raw data just as we see in the first table above. The asterisk in the query says that we want to include all the data columns.

In the query, we can also name the result columns however we want. This this example, we name the result columns as part of the query:

SELECT Date AS IncidentDate,EmpID AS Employee
FROM [Tardiness Records$]

IncidentDate

Employee

6/3/2009

4

6/25/2009

2

6/26/2009

1

Etc…

Next, we group all the data together and count the total number of tardiness incidents by using the Count() function in the query:

SELECT ‘Total Incidents’ AS Category,Count(*) AS Num
FROM [Tardiness Records$]

Category

Num

Total Incidents

22

Next, consider this query.

SELECT 'Last 30 days' AS TimeFrame, Count(*) AS Num
FROM [Tardiness Records$]
WHERE Date > Date() – 30

TimeFrame

Num

Last 30 days

6

The Date() functioninserts the current date into the query. This helps keep the data and charts up to date. This part of the SQL statement: “where Date > Date() – 30” says: only select the rows that have dates greater than the current date minus 30. In other words, count the rows of data that occurred after 30 days ago.

We can do a similar query to get record from 30 to 60 days ago:

SELECT 'Last 30-60 days' AS TimeFrame,Count(*) AS Num
FROM [Tardiness Records$]
WHERE Date > Date() - 60
AND Date <= Date() – 30

TimeFrame

Num

Last 30-60 days

4

In the case above, the counted data must satisfy two conditions. It must be greater than the current date minus 60 days and less than or equal to the current date minus 30 days.

To reach the goal of 4 date range categories, we need the following four queries:

(1)
SELECT 'Last 30 days' AS TimeFrame,Count(*)AS NumOfTardyEmp
FROM [Tardiness Records$]
WHERE Date > Date() - 30

(2)
SELECT 'Last 30-60 days' AS TimeFrame,Count(*)AS NumOfTardyEmp
FROM [Tardiness Records$]
WHERE Date > Date() - 60
AND Date <= Date() - 30

(3)
SELECT 'Last 60-90 days' AS TimeFrame,Count(*)AS NumOfTardyEmp
FROM [Tardiness Records$]
WHERE Date > Date() - 90
AND Date <= Date() - 60

(4)
SELECT 'Last 90-120 days' AS TimeFrame,Count(*)AS NumOfTardyEmp
FROM [Tardiness Records$]
WHERE Date > Date() - 120
AND Date <= Date() – 90

To get a single, easily chartable, result set we combine these four queries into a single query by using the UNION keyword as shown below:

SELECT 'Last 30 days' AS TimeFrame,Count(*)AS Num
FROM [Tardiness Records$]
WHERE Date > Date() – 30
union
SELECT 'Last 30-60 days' AS TimeFrame,Count(*)AS Num
FROM [Tardiness Records$]
WHERE Date > Date() - 60
AND Date <= Date() – 30
union
SELECT 'Last 60-90 days' AS TimeFrame,Count(*)AS Num
FROM [Tardiness Records$]
WHERE Date > Date() - 90
AND Date <= Date() – 60
union
SELECT 'Last 90-120 days' AS TimeFrame,Count(*)AS Num
FROM [Tardiness Records$]
WHERE Date > Date() - 120
AND Date <= Date() – 90

Here are the results from our UNION query. To chart this in CHARTrunner, create a Pareto chart, treat the TimeFrame column as Categories and treat the Num column as Count.

TimeFrame

Num

Last 30 days

6

Last 30-60 days

4

Last 60-90 days

7

Last 90-120 days

5

See the chart definition form and the resulting chart below.

Queries can quite useful in transforming data. Often, the transformed data will be easier to visualize and understand. Grouping data into date intervals is one common pattern we have demonstrated here. However, keep in mind the power of queries to transform data. There are many other patterns that can lead you to a better visualization of your data.

Note the zip file which accompanies this article contains a CHARTrunner chart definition and an Excel worksheet that demonstrate these techniques. The dates in the Excel sheet may vary from what you see above based on the date you view them.

Attachments (2) Attachments
Related Articles RSS Feed
CR: Largest Subgroup Size Supported by CHARTrunner
Viewed 1982 times since Fri, Nov 9, 2007
CR: Histogram bars beyond specification lines
Viewed 2267 times since Fri, Nov 9, 2007
CR: Charting counted incidents
Viewed 1871 times since Thu, Feb 23, 2012
CR: Charts not working on a new PC
Viewed 1954 times since Fri, Nov 9, 2007
CR: Charts with large amounts of data
Viewed 1870 times since Fri, Nov 9, 2007
CR: CHARTrunner and SharePoint
Viewed 3899 times since Fri, Mar 9, 2012
CR: Excel tips
Viewed 2962 times since Tue, Oct 6, 2009
CR: Why is my first row of data not included when I get my chart from Excel
Viewed 2076 times since Fri, Nov 9, 2007
CR: Bar Chart of a column of numbers in various categories
Viewed 1918 times since Fri, Nov 9, 2007
CR: Multiple chart lines from one data column
Viewed 2180 times since Fri, Nov 9, 2007