Please Note: This article is written for users of the following Microsoft Excel versions: 2007, 2010, and 2013. If you are using an earlier version (Excel 2003 or earlier), this tip may not work for you. For a version of this tip written specifically for earlier versions of Excel, click here: Calculating Statistical Values on Different-Sized Subsets of Data.

# Calculating Statistical Values on Different-Sized Subsets of Data

by Allen Wyatt
(last updated January 24, 2018)

Chris has a huge amount of data in a worksheet and he wants to analyze the data based on different groupings within it. For instance, he has data in cells A2:B36001, where row 1 contains the column headings Time and Signal. He wants to divide the data into groups consisting of some arbitrary number of sequential values, and then extract, for each group, a mean value for the Time, a mean value for the Signal, and a standard deviation for the Signal.

The easiest way to handle this type of requirement is to add a column that is used to indicate a group number for each row. Follow these steps:

1. Put the heading Group into cell C1.
2. In cell E1 enter the number of values that should be in each group. For instance, if you want each group to contain 10 sequential values, enter the number 10 in cell E1.
3. In cell C2 enter this formula: =INT((ROW()-ROW(\$C\$2))/\$E\$1)+1
4. Copy the formula in cell C2 to the range C3:C36001. Column C now contains a "group number" for each row, based on the value in cell E1. If E1 is 10, you end up with 3600 groups, 1 through 3600. If E1 is 100, you end up with 360 groups, 1 through 360.
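If you want to check the group-number arithmetic outside Excel, it can be sketched in a few lines of Python. This is only an illustration and not part of the worksheet: the function name `group_number` and its parameters are hypothetical, and it assumes that the data starts in row 2 and that the group size is a whole number, as in the steps above.

```python
def group_number(row: int, group_size: int, first_data_row: int = 2) -> int:
    """Mirror =INT((ROW()-ROW($C$2))/$E$1)+1 for one worksheet row.

    row            -- the worksheet row holding the data point (ROW())
    group_size     -- the value you would type into cell E1
    first_data_row -- the row of the first data point (2, per the tip)
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # Floor division plays the role of Excel's INT for these
    # non-negative row offsets.
    return (row - first_data_row) // group_size + 1
```

With a group size of 10, rows 2 through 11 fall in group 1, row 12 starts group 2, and row 36001 lands in group 3600, which matches the counts described in step 4.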

With the group numbers set up, you are ready to do the analysis. There are a couple of ways you can do this. One way is to use the subtotaling capabilities of Excel. Select one of the cells in the data area and follow these steps:

1. Display the Data tab of the ribbon and click the Subtotal tool in the Outline group. Excel displays the Subtotal dialog box.
2. Change the At Every Change In drop-down list to Group.
3. Change the Use Function drop-down list to indicate the type of statistic you want to calculate for each group.
4. Change the Add Subtotal To area so that only Time or Signal is selected, as appropriate.
5. Click OK.

Excel groups and subtotals the data, as directed. (This process may take a while, depending on how much data you have and how many groups it contains.) You can hide the detail and show only the subtotals by clicking on the small 2 (with the box around it) in the outline area at the left of the worksheet. If you later want to change what is calculated, or if you need to change the number of items in each group, just remove the subtotals (using the Remove All button in the Subtotal dialog box) and repeat the above steps.

Another way to derive the statistics from your data is to use a PivotTable. Make sure that there are no subtotals in the data and select a cell within the data. Then follow these steps:

1. Display the Insert tab of the ribbon.
2. Click the PivotTable tool. (This tool is the first one at the left of the Insert tab.) Excel displays the Create PivotTable dialog box.
3. Click OK. (The default options in the dialog box are just fine.) Excel creates a blank PivotTable and displays a field list at the right of the worksheet.
4. Drag the Group field to the Row Labels area, just below the field list.
5. Drag the Time field to the Values area, just below the field list.
6. Drag the Signal field to the Values area, just below the field list.
7. Drag the Signal field, once again, to the Values area. The PivotTable should now show "Count of Time," "Sum of Signal," and "Sum of Signal2."
8. In the Values area, click the "Count of Time" label. Excel displays a Context menu.
9. Choose Value Field Settings. Excel displays the Value Field Settings dialog box.
10. In the Summarize Value Field By list, choose Average.
11. Click OK. The "Count of Time" labels change to "Average of Time."
12. In the Values area, click the "Sum of Signal" label. Excel displays a Context menu.
13. Choose Value Field Settings. Excel displays the Value Field Settings dialog box.
14. In the Summarize Value Field By list, choose Average.
15. Click OK. The "Sum of Signal" labels change to "Average of Signal."
16. In the Values area, click the "Sum of Signal2" label. Excel displays a Context menu.
17. Choose Value Field Settings. Excel displays the Value Field Settings dialog box.
18. In the Summarize Value Field By list, choose StdDev.
19. Click OK. The "Sum of Signal2" labels change to "StdDev of Signal2."

You now have the data desired. If you need to change the number of data items in each group, just go back to the data worksheet and change cell E1 to a different value. You can then return to the PivotTable, display the Options tab of the ribbon (the Analyze tab in Excel 2013), and click the Refresh button.
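To sanity-check the numbers that the subtotals or the PivotTable produce, the same three statistics can be computed independently. The following is a minimal Python sketch, not something Excel runs: the function name `summarize` and the sample values are made up for illustration. It assumes the Time and Signal columns have been read into two equal-length lists, and it uses the sample standard deviation, which is what the StdDev summary (as opposed to StdDevp) calculates.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize(times, signals, group_size):
    """Return {group: (mean_time, mean_signal, stdev_signal)}.

    Groups are formed from consecutive values, group_size at a time,
    numbered from 1 just like the Group column in the worksheet.
    """
    groups = defaultdict(lambda: ([], []))
    for index, (t, s) in enumerate(zip(times, signals)):
        group = index // group_size + 1
        groups[group][0].append(t)
        groups[group][1].append(s)

    summary = {}
    for group, (ts, ss) in groups.items():
        # A single value has no sample standard deviation,
        # so None is used as a placeholder in that case.
        sd = stdev(ss) if len(ss) > 1 else None
        summary[group] = (mean(ts), mean(ss), sd)
    return summary


# Hypothetical sample data: six readings in groups of two.
result = summarize([0, 1, 2, 3, 4, 5], [1, 3, 2, 4, 10, 10], 2)
for group, stats in sorted(result.items()):
    print(group, stats)
```

Changing the third argument plays the same role as changing cell E1 and refreshing the PivotTable.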

ExcelTips is your source for cost-effective Microsoft Excel training. This tip (8628) applies to Microsoft Excel 2007, 2010, and 2013. You can find a version of this tip for the older menu interface of Excel here: Calculating Statistical Values on Different-Sized Subsets of Data.

##### Author Bio

Allen Wyatt

With more than 50 non-fiction books and numerous magazine articles to his credit, Allen Wyatt is an internationally recognized author. He is president of Sharon Parq Associates, a computer and publishing services company. ...

##### Comments

2018-01-26 11:02:28

Peter Atherton

John

I used a couple of formulas to get the data based on rand()

The formulas in D:E are:

| Cell ref | Formula |
|----------|---------|
| D2 | =A2 |
| D3 | =D2+TIME(0,0,1) |
| E2 | =COUNTIFS(\$A\$2:\$A\$20000,">="&D2,\$A\$2:\$A\$20000,"<="&D3) |
| F2 | =(SUMIFS(\$B\$2:\$B\$20000,\$A\$2:\$A\$20000,">="&D2,\$A\$2:\$A\$20000,"<="&\$D3))/E2 |

(see Figure 1 below)

Figure 1.

2018-01-24 12:31:01

John

I have a similar problem except the number of values in each grouping varies and I have only 2 columns of data to work with.
I have about 20,000 rows of data. One of the 2 columns I am interested in is time (mm:ssss) which increments approximately 1 second every 4 to 14 rows. The other column is feet per second (fps) at the time in the time column and is very "noisy". To reduce the noise of the fps data I need to average the fps data over nearly one full second to reduce the "noise" in the instantaneous value.

What I need to do is key on the time column and when the truncated difference between row (n+x) - row( n) is >/= to 1, then average the fps column values of rows (n) through (n+x) and put the average in a new column at the (n+x) row after dividing it by the m.ssss in the differenced time column.
This gives me a column with the average fps value over the previous second which is a much less "noisy" value than the instantaneous values.

Doing this manually would take a couple of weeks.

2014-04-14 02:22:26

barouh

Thank you for this useful tip. And any ideas on what is the best way to solve the similar task, when we want to calculate not the means, but the medians for X subcategories?

2014-04-12 11:49:36

Willy Vanhaelen

ROW(\$C\$2) will always return 2 so instead of =INT((ROW()-ROW(\$C\$2))/\$E\$1)+1 you can better use =INT((ROW()-2)/\$E\$1)+1
