Please Note: This article is written for users of the following Microsoft Excel versions: 2007, 2010, 2013, 2016, 2019, Excel in Microsoft 365, and 2021. If you are using an earlier version (Excel 2003 or earlier), this tip may not work for you. For a version of this tip written specifically for earlier versions of Excel, click here: Deleting Duplicate Columns.
Written by Allen Wyatt (last updated May 11, 2024)
This tip applies to Excel 2007, 2010, 2013, 2016, 2019, Excel in Microsoft 365, and 2021
Dror has a worksheet that contains quite a bit of data. It is possible that the data in one column will be exactly the same as the data in another column, so he wonders if there is an easy way to delete any duplicate columns within the worksheet.
The first step, of course, is to figure out if two columns are identical or not. This can be determined rather easily with a formula such as the following:
=AND(A1:A100=B1:B100)
If you are using an older version of Excel (Excel 2016 or older), then this must be entered as an array formula using Shift+Ctrl+Enter. In the newest versions of Excel, this is not necessary. The formula compares all the values in the first 100 rows of columns A and B. If they are all the same, then the formula returns TRUE. If any of the cells don't match, then the formula returns FALSE. If the result is TRUE you could then delete one of the columns because they are the same.
If you want something that is a bit more automatic, meaning that the duplicate column is deleted, then you'll need to use a macro. The following steps through all the columns in the worksheet and, starting with the right-most column, compares all the columns. If any are the same—regardless of their order in the worksheet—then the macro asks if you want the duplicate column deleted.
Sub DeleteDuplicateColumns() Dim rngData As Range Dim arr1, arr2 Dim i As Integer, j As Integer, n As Integer On Error Resume Next Set rngData = ActiveSheet.UsedRange If rngData Is Nothing Then Exit Sub n = rngData.Columns.Count For i = n To 2 Step -1 For j = i - 1 To 1 Step -1 If WorksheetFunction.CountA(rngData.Columns(i)) <> 0 And _ WorksheetFunction.CountA(rngData.Columns(j)) <> 0 Then arr1 = rngData.Columns(i) arr2 = rngData.Columns(j) If AreEqualArr(arr1, arr2) Then With rngData.Columns(j) 'mark column to be deleted .Copy If MsgBox("Delete marked column?", vbYesNo) _ = vbYes Then rngData.Columns(j).Delete Else 'remove mark Application.CutCopyMode = False End If End With End If End If Next j Next i End Sub
Function AreEqualArr(arr1, arr2) As Boolean Dim i As Long, n As Long AreEqualArr = False For n = LBound(arr1) To UBound(arr1) If arr1(n, 1) <> arr2(n, 1) Then Exit Function End If Next n AreEqualArr = True End Function
Note:
ExcelTips is your source for cost-effective Microsoft Excel training. This tip (5674) applies to Microsoft Excel 2007, 2010, 2013, 2016, 2019, Excel in Microsoft 365, and 2021. You can find a version of this tip for the older menu interface of Excel here: Deleting Duplicate Columns.
Create Custom Apps with VBA! Discover how to extend the capabilities of Office 2013 (Word, Excel, PowerPoint, Outlook, and Access) with VBA programming, using it for writing macros, automating Office applications, and creating custom applications. Check out Mastering VBA for Office 2013 today!
When entering data in a worksheet, you may only want to add information to the cells in a particular range. You can ...
Discover MoreWhen entering information in a worksheet, it is common to also note a date or time corresponding to the entry. There are ...
Discover MoreDo you want to limit what can be entered into a particular cell in your worksheet? Here are three separate ways you can ...
Discover MoreFREE SERVICE: Get tips like this every week in ExcelTips, a free productivity newsletter. Enter your address and click "Subscribe."
2024-05-14 11:01:42
J. Woolley
Re. my previous comment below, the DeleteDuplicateColumns2 macro considers hidden columns but ignores hidden rows. It would be better to make any hidden rows visible before running the macro then restore their original status when finished. Here's one way to do that:
Sub DeleteDuplicateColumns3()
Dim rng As Range, nRows As Long, n As Long, hide() As Boolean
If Not (TypeOf ActiveSheet Is Worksheet) Then Exit Sub
Set rng = ActiveSheet.UsedRange 'used range (subject to change)
With rng 'fixed range equal to original used range
nRows = .Rows.Count
ReDim hide(1 To nRows)
For n = 1 To nRows
hide(n) = .Rows(n).Hidden 'original status
.Rows(n).Hidden = False 'make visible
Next n
Call DeleteDuplicateColumns2
For n = 1 To nRows
.Rows(n).Hidden = hide(n) 'restore status
Next n
End With
End Sub
2024-05-13 14:47:01
J. Woolley
Re. the Tip's macro, I have some nits to pick:
1. The reason for the following statement is not explained; it hides potential bugs and should be reconsidered:
On Error Resume Next
2. ActiveSheet.UsedRange is never Nothing; an empty sheet returns cell $A$1. Therefore, the following statement is unnecessary:
If rngData Is Nothing Then Exit Sub
3. Using .Copy to mark a column has no visible effect when the MsgBox is displayed; use .Select instead. Therefore, the following statement is unnecessary:
Application.CutCopyMode = False
4. A column marked for deletion might be off-screen or hidden when the MsgBox is displayed, and it might be separated from its matching column by several intervening columns.
5. The following statement:
rngData.Columns(j).Delete
should be replaced by
.EntireColumn.Delete
but the result is the same because Excel shifts a partial column with height rngData.Rows.Count to the left.
6. Columns are deleted right-to-left, but empty columns are not deleted; therefore, empty columns accumulate on the left.
7. Function AreEqualArr assumes Variants arr1 and arr2 are 2D base-1 single column arrays with equal length. This is true, but only because each represents a single worksheet column limited by rngData.
8. Columns are assumed equal if their values are equal even if their formulas are different.
The following version addresses these issues:
Sub DeleteDuplicateColumns2()
Dim sel As Range, rng As Range, ans As Variant
Dim nCols As Long, i As Long, j As Long, k As Long
Dim skip() As Boolean, hide() As Boolean, dele() As Boolean
If Not (TypeOf ActiveSheet Is Worksheet) Then Exit Sub
Set sel = Selection 'original Selection
Set rng = ActiveSheet.UsedRange
With rng 'original used range (subject to change)
nCols = .Columns.Count
ReDim skip(1 To nCols), hide(1 To nCols), dele(1 To nCols)
For i = 1 To nCols
skip(i) = (WorksheetFunction.CountA(.Columns(i)) = 0) 'empty
hide(i) = .Columns(i).Hidden 'original status
dele(i) = False
Next i
For i = 1 To nCols - 1 'skipped if nCols = 1
If Not (skip(i) Or dele(i)) Then
.Columns(i).Hidden = False 'make visible
.Columns(i).Cells(1).Show
For j = i + 1 To nCols
If Not (skip(j) Or dele(j)) Then
If ColumnsAreEqual(.Columns(i), .Columns(j)) Then
For k = i + 1 To j - 1 'skipped if i + 1 > j - 1
.Columns(k).Hidden = True 'make invisible
Next k
With .Columns(j)
.Hidden = False 'make visible
.Cells(1).Show
.Select
ans = MsgBox("Delete this column?", vbYesNo)
If ans = vbYes Then
dele(j) = True 'pending deletion
hide(j) = True 'pending deletion
End If
.Hidden = hide(j) 'restore status
End With
For k = i + 1 To j - 1 'skipped if i + 1 > j - 1
.Columns(k).Hidden = hide(k) 'restore status
Next k
End If
End If
Next j
.Columns(i).Hidden = hide(i) 'restore status
End If
Next i
sel.Select 'restore Selection
For i = nCols To 2 Step -1 'skipped if nCols = 1
If dele(i) Then
.Columns(i).EntireColumn.Delete
If i = nCols Then
hide(i) = False 'beyond used range
Else
hide(i) = hide(i + 1)
End If
.Columns(i).Hidden = hide(i)
End If
Next i
End With
End Sub
Function ColumnsAreEqual(C1 As Range, C2 As Range) As Boolean
'C1 and C2 must be single columns with equal length and same last row
'a merged cell in only one column does not necessarily make it unequal
'default return value is False
Dim n As Long
On Error Resume Next
If Split(C1.Address, "$")(4) <> Split(C2.Address, "$")(4) Or _
C1.Columns.Count <> 1 Or C2.Columns.Count <> 1 Or _
C1.Rows.Count <> C2.Rows.Count Then Exit Function
If Err Then Exit Function
On Error GoTo 0
For n = 1 To C1.Rows.Count
If C1.Cells(n).Formula <> C2.Cells(n).Formula Then Exit Function
If C1.Cells(n).Value <> C2.Cells(n).Value Then Exit Function
Next n
ColumnsAreEqual = True
End Function
Got a version of Excel that uses the ribbon interface (Excel 2007 or later)? This site is for you! If you use an earlier version of Excel, visit our ExcelTips site focusing on the menu interface.
FREE SERVICE: Get tips like this every week in ExcelTips, a free productivity newsletter. Enter your address and click "Subscribe."
Copyright © 2024 Sharon Parq Associates, Inc.
Comments