 Please Note: This article is written for users of the following Microsoft Excel versions: 2007, 2010, 2013, 2016, 2019, and Excel in Office 365. If you are using an earlier version (Excel 2003 or earlier), this tip may not work for you. For a version of this tip written specifically for earlier versions of Excel, click here: Calculating Statistical Values on Different-Sized Subsets of Data.

# Calculating Statistical Values on Different-Sized Subsets of Data

by Allen Wyatt
(last updated November 23, 2019)

Chris has a huge amount of data in a worksheet and he wants to analyze the data based on different groupings within it. For instance, he has data in cells A2:B36001, where row 1 contains the column headings Time and Signal. He wants to divide the data into groups consisting of some arbitrary number of sequential values, and then extract, for each group, a mean value for the Time, a mean value for the Signal, and a standard deviation for the Signal.

The easiest way to handle this type of requirement is to add a column that is used to indicate a group number for each row. Follow these steps:

1. Put the heading Group into cell C1.
2. In cell E1 enter the number of values that should be in each group. For instance, if you want each group to contain 10 sequential values, enter the number 10 in cell E1.
3. In cell C2 enter this formula: =INT((ROW()-ROW(\$C\$2))/\$E\$1)+1
4. Copy the formula in cell C2 to the range C3:C36001. Column C now contains a "group number" for each row, based on the value in cell E1. If E1 is 10, you end up with 3600 groups, 1 through 3600. If E1 is 100, you end up with 360 groups, 1 through 360.
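The group-number arithmetic can be checked outside Excel. The following is a minimal Python sketch, not part of the original tip; the function name `group_number` and its parameters are hypothetical, and it assumes the data starts in row 2 as described above.

```python
def group_number(row, group_size, first_data_row=2):
    """Mirror =INT((ROW()-ROW($C$2))/$E$1)+1 for a worksheet row number."""
    # Floor division matches INT() here because the offset is never negative.
    return (row - first_data_row) // group_size + 1

# Rows 2..36001 hold the 36,000 data points described above.
rows = range(2, 36002)
groups_of_10 = [group_number(r, 10) for r in rows]
groups_of_100 = [group_number(r, 100) for r in rows]

print(groups_of_10[0], groups_of_10[9], groups_of_10[10])  # 1 1 2
print(max(groups_of_10), max(groups_of_100))                # 3600 360
```

The first ten data rows land in group 1 and the eleventh starts group 2, which matches the group counts quoted in step 4.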

With the group numbers set up, you are ready to do the analysis. There are a couple of ways you can do this. One way is to use the subtotaling capabilities of Excel. Select one of the cells in the data area and follow these steps:

1. Display the Data tab of the ribbon and click the Subtotal tool in the Outline group. Excel displays the Subtotal dialog box.
2. Change the At Every Change In drop-down list to Group.
3. Change the Use Function drop-down list to indicate the type of statistic you want to calculate for each group.
4. Change the Add Subtotal To area so that only Time or Signal is selected, as appropriate.
5. Click OK.

Excel groups and subtotals the data, as directed. (This process may take a while depending on the size of your groups.) You can hide the detail and show only the subtotals by clicking the small 2 (with the box around it) in the outline area at the left of the worksheet. If you later want to change what is calculated, or if you need to change the number of items in each group, just remove the subtotals (using the Remove All button in the Subtotal dialog box) and repeat the above steps.
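To make the "at every change in Group" idea concrete, here is a rough Python sketch of the same logic. It is an illustration only, not how Excel implements subtotals; the sample rows and the `subtotal` helper are hypothetical, and the rows are assumed to be sorted by group already, as they are in the worksheet.

```python
from itertools import groupby
from statistics import mean

# Hypothetical sample rows: (time, signal, group), already in group order.
rows = [
    (0.0, 2.0, 1), (1.0, 4.0, 1), (2.0, 6.0, 1),
    (3.0, 1.0, 2), (4.0, 3.0, 2), (5.0, 5.0, 2),
]

def subtotal(rows, func, column):
    """Apply func to one column each time the group number changes."""
    results = []
    for group, members in groupby(rows, key=lambda r: r[2]):
        values = [m[column] for m in members]
        results.append((group, func(values)))
    return results

# Average of the Signal column (index 1) for each group.
print(subtotal(rows, mean, 1))
```

Passing a different function or column index plays the role of changing the Use Function and Add Subtotal To settings.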

Another way to derive the statistics from your data is to use a PivotTable. Make sure that there are no subtotals in the data and select a cell within the data. Then follow these steps:

1. Display the Insert tab of the ribbon.
2. Click the PivotTable tool. (This tool is the first one at the left of the Insert tab.) Excel displays the Create PivotTable dialog box.
3. Click OK. (The default options in the dialog box are just fine.) Excel creates a blank PivotTable and displays a field list at the right of the worksheet.
4. Drag the Group field to the Row Labels area, just below the field list.
5. Drag the Time field to the Values area, just below the field list.
6. Drag the Signal field to the Values area, just below the field list.
7. Drag the Signal field, once again, to the Values area. The PivotTable should now show "Count of Time," "Sum of Signal," and "Sum of Signal2."
8. In the Values area, click the "Count of Time" label. Excel displays a Context menu.
9. Choose Value Field Settings. Excel displays the Value Field Settings dialog box.
10. In the Summarize Value Field By list, choose Average.
11. Click OK. The "Count of Time" labels change to "Average of Time."
12. In the Values area, click the "Sum of Signal" label. Excel displays a Context menu.
13. Choose Value Field Settings. Excel displays the Value Field Settings dialog box.
14. In the Summarize Value Field By list, choose Average.
15. Click OK. The "Sum of Signal" labels change to "Average of Signal."
16. In the Values area, click the "Sum of Signal2" label. Excel displays a Context menu.
17. Choose Value Field Settings. Excel displays the Value Field Settings dialog box.
18. In the Summarize Value Field By list, choose StdDev.
19. Click OK. The "Sum of Signal2" labels change to "StdDev of Signal2."

You now have the desired data. If you need to change the number of data items in each group, just go back to the data worksheet and change cell E1 to a different value. You can then return to the PivotTable, display the Options tab of the ribbon (named Analyze or PivotTable Analyze in Excel 2013 and later), and click the Refresh button.
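For readers who want to cross-check the PivotTable figures, the sketch below computes the same three statistics per group in Python. It is a hedged illustration with hypothetical names and sample data; it assumes that the PivotTable's StdDev setting is the sample standard deviation, which is what `statistics.stdev` computes.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize(times, signals, group_size):
    """Return {group: (mean time, mean signal, sample std dev of signal)}."""
    buckets = defaultdict(lambda: ([], []))
    for i, (t, s) in enumerate(zip(times, signals)):
        group = i // group_size + 1   # same numbering as the column C formula
        buckets[group][0].append(t)
        buckets[group][1].append(s)
    return {
        # stdev needs at least two values; a single-row group yields NaN.
        g: (mean(ts), mean(ss), stdev(ss) if len(ss) > 1 else float("nan"))
        for g, (ts, ss) in buckets.items()
    }

# Hypothetical data: six readings, grouped three at a time.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
signals = [2.0, 4.0, 6.0, 1.0, 3.0, 5.0]
for group, stats in summarize(times, signals, 3).items():
    print(group, stats)
```

Calling `summarize` with a different `group_size` corresponds to changing cell E1 and refreshing the PivotTable.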

This tip (8628) applies to Microsoft Excel 2007, 2010, 2013, 2016, 2019, and Excel in Office 365.

2018-01-26 11:02:28

Peter Atherton

John

I used a couple of formulas to get the data based on rand()

The formulas in D:F are:

| Cell ref | Formula |
| --- | --- |
| D2 | =A2 |
| D3 | =D2+TIME(0,0,1) |
| E2 | =COUNTIFS(\$A\$2:\$A\$20000,">="&D2,\$A\$2:\$A\$20000,"<="&D3) |
| F2 | =(SUMIFS(\$B\$2:\$B\$20000,\$A\$2:\$A\$20000,">="&D2,\$A\$2:\$A\$20000,"<="&\$D3))/E2 |

(see Figure 1 below)

Figure 1.
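As a rough illustration of what the COUNTIFS and SUMIFS pair computes, here is a small Python sketch. The function name, the sample readings, and the use of plain seconds (rather than Excel's fraction-of-a-day time values) are all assumptions made for the example.

```python
def window_average(times, values, start, width=1.0):
    """Average the values whose time lies in [start, start + width], inclusive,
    as COUNTIFS/SUMIFS with ">=" and "<=" criteria would."""
    selected = [v for t, v in zip(times, values) if start <= t <= start + width]
    count = len(selected)            # the COUNTIFS result (column E)
    if count == 0:
        return None                  # Excel would show #DIV/0! here
    return sum(selected) / count     # SUMIFS divided by COUNTIFS (column F)

# Hypothetical readings: time in seconds, speed in feet per second.
times = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]
fps = [10.0, 12.0, 11.0, 13.0, 14.0, 20.0, 22.0]
print(window_average(times, fps, 0.0))  # averages the first five readings
```

Because both ends of the window are inclusive, a reading that falls exactly on a boundary is counted in two consecutive windows.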

2018-01-24 12:31:01

John

I have a similar problem except the number of values in each grouping varies and I have only 2 columns of data to work with.
I have about 20,000 rows of data. One of the 2 columns I am interested in is time (mm:ssss) which increments approximately 1 second every 4 to 14 rows. The other column is feet per second (fps) at the time in the time column and is very "noisy". To reduce the noise of the fps data I need to average the fps data over nearly one full second to reduce the "noise" in the instantaneous value.

What I need to do is key on the time column: when the truncated difference between row (n+x) and row (n) is greater than or equal to 1, average the fps column values of rows (n) through (n+x) and put the average in a new column at row (n+x), after dividing it by the m.ssss in the differenced time column.
This gives me a column with the average fps value over the previous second which is a much less "noisy" value than the instantaneous values.

Doing this manually would take a couple of weeks.

2014-04-14 02:22:26

barouh

Thank you for this useful tip. And any ideas on what is the best way to solve the similar task, when we want to calculate not the means, but the medians for X subcategories?

2014-04-12 11:49:36

Willy Vanhaelen

ROW(\$C\$2) will always return 2 so instead of =INT((ROW()-ROW(\$C\$2))/\$E\$1)+1 you can better use =INT((ROW()-2)/\$E\$1)+1
