Written by Allen Wyatt (last updated June 1, 2024)
This tip applies to Excel 2007, 2010, 2013, 2016, 2019, Excel in Microsoft 365, and 2021
Lucie has imported into a workbook data from her company's HR/payroll system that she needs to forward to a third-party vendor. Before she forwards it, however, she needs to compile a list of foreign-language characters contained in the employee names, such as umlauts and accents. She doesn't need to replace the characters; she just needs a list of them.
There are a couple of ways to do this. First, if you are using Excel 2021 or the version with Microsoft 365, you can put together a powerful formula that will pull the desired characters:
=LET(wl,A:A,cwl,TEXTJOIN(,TRUE,wl),cl,MID(cwl,SEQUENCE(LEN(cwl)),1), SORT(UNIQUE(FILTER(cl,CODE(cl)>127))))
Even though the formula is shown here in two lines, it is a single formula. It assumes that all of the employee names are in column A. You can place this formula in a cell in a different column, and as long as there is nothing in the cells below the formula, you'll have a sorted list of characters returned.
The formula works by concatenating all the names in column A into a single string (assigned to the variable cwl) and then examining each character in the string. If the character has a code value of 128 or greater, then it is considered a foreign character. (Those with codes below 128 are assumed to be non-foreign characters.)
The formula returns only unique characters, and those are sorted. Because the formula concatenates all the names (using the TEXTJOIN function), there is a limit on what it can process. If the combined length of all the characters is 32,767 or more, then a #CALC error is returned.
If you are not using the latest versions of Excel or you might have more than 32,767 characters you are examining, then you should consider using a macro. Here's an example of one that will go through all of the cells in a range and return the foreign-language characters:
Function ForeignChars(ByVal MyRange As Range) As String 'All characters from 0 to 127 are considered non-foreign 'All characters above 127 are considered foreign Dim c As Range Dim sTemp As String Dim sChars As String Dim J As Integer Dim K As Integer Dim bFound As Boolean Application.Volatile sChars = "" For Each c In MyRange sTemp = c.Text For J = 1 To Len(sTemp) If AscW(Mid(sTemp, J, 1)) > 127 Then bFound = False For K = 1 To Len(sChars) If Mid(sChars, K, 1) = Mid(sTemp, J, 1) Then bFound = True Next K If Not bFound Then sChars = sChars & Mid(sTemp, J, 1) End If End If Next J Next c ForeignChars = sChars End Function
In order to use the function, you could use this in your worksheet:
=ForeignChars(A1:A2500)
This checks the contents of the range designated (A1:A2500). The foreign-language characters are returned as a single string by the function. You'll also find that the macro returns a wider range of foreign characters than the earlier formula because the CODE worksheet function (used in the formula) evaluates text a bit differently than the AscW VBA function (used in the macro).
Note:
ExcelTips is your source for cost-effective Microsoft Excel training. This tip (767) applies to Microsoft Excel 2007, 2010, 2013, 2016, 2019, Excel in Microsoft 365, and 2021.
Solve Real Business Problems Master business modeling and analysis techniques with Excel and transform data into bottom-line results. This hands-on, scenario-focused guide shows you how to use the latest Excel tools to integrate data from multiple tables. Check out Microsoft Excel 2013 Data Analysis and Business Modeling today!
Want to sum the values in the same cell on a range of worksheets? It's not as easy as summing a range on the same ...
Discover MoreIf you have two columns containing dates and weights from those dates, you may want to pick a date associated with a ...
Discover MoreTrying to calculate how much people owe you? If you charge interest or service charges on past-due accounts, there are a ...
Discover MoreFREE SERVICE: Get tips like this every week in ExcelTips, a free productivity newsletter. Enter your address and click "Subscribe."
2024-06-06 15:30:56
J. Woolley
As noted in my first comment below, Excel 2021's UNIQUE function is not case-sensitive. An equivalent using REDUCE, LAMBDA, and VSTACK is case-sensitive, but those functions currently require Excel 365.
My Excel Toolbox now includes the following function to return unique rows or columns from a cell range (contiguous) or array constant or result of an array function:
=UniquePlus(RangeArray, [LeftRight], [ExactlyOnce], [HasHeader],
[CaseSensitive])
The first three parameters match Excel's UNIQUE function.
When HasHeader is TRUE, the first row (LeftRight=FALSE) or first column (LeftRight=TRUE) is returned even if it is not unique.
Case-sensitive comparisons are done when CaseSensitive is TRUE.
This function does not require Excel 2021 or newer.
See https://sites.google.com/view/MyExcelToolbox
2024-06-02 16:48:50
J. Woolley
The Tip's ForeignChars function returns an unsorted string of characters; for example
ëÄäéÉË
The result will be case-sensitive as illustrated in that example if the function is in a module with the default
Option Compare Binary
But if the module begins with
Option Compare Text
the result will not be case-sensitive and the previous example would simply return
ëÄé
The following version returns a sorted case-sensitive comma-separated result for that example like this
Ä, É, Ë, ä, é, ë
regardless of Option Compare:
Function ForeignChars2(ByVal Target As Range) As String
'ASCII is 0 to 127; non-ASCII is "foreign"
Dim Chars As New Collection, char As String
Dim cell As Range, n As Integer, k As Integer
Const COMP As Integer = vbBinaryCompare 'case-sensitive
If Target.Cells.Count > 1 Then
Set Target = Application.Intersect(Target.Parent.UsedRange, Target)
End If
For Each cell In Target
If WorksheetFunction.IsText(cell) Then
For n = 1 To Len(cell)
char = Mid(cell, n, 1)
If AscW(char) > 127 Then 'non-ASCII
If Chars.Count = 0 Then
Chars.Add char 'add first
ElseIf StrComp(char, Chars(Chars.Count), COMP) > 0 Then
Chars.Add char 'add char > Chars(Chars.Count)
Else
For k = 1 To Chars.Count
Select Case StrComp(char, Chars(k), COMP)
Case 0 'already added char = Chars(k)
Exit For
Case Is < 0 'add char < Chars(k)
Chars.Add char, , k 'before Chars(k)
Exit For
End Select
Next k
End If
End If
Next n
End If
Next cell
If Chars.Count = 0 Then ForeignChars2 = "none": Exit Function
ForeignChars2 = Chars(1)
For k = 2 To Chars.Count
ForeignChars2 = ForeignChars2 & ", " & Chars(k)
Next k
End Function
2024-06-01 18:47:19
J. Woolley
The Tip's first formula is very clever:
=LET(wl, A:A, cwl, TEXTJOIN(, TRUE, wl), cl, MID(cwl, SEQUENCE(LEN(cwl)) ,1), SORT(UNIQUE(FILTER( cl, CODE(cl)>127))))
Here are some nit-picking comments:
1. TEXTJOIN(, TRUE, wl) could be replaced by CONCAT(wl) or CONCAT(A:A) for the same result. Using the latter means the wl parameter is unnecessary:
=LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)), 1), SORT(UNIQUE(FILTER( cl, CODE(cl)>127))))
2. The Tip's formula returns a vertical array with N rows and 1 column, where N is the number of unique "foreign-language" (non-ASCII) characters. Lucie might prefer a comma separated list in a single cell:
=LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)) ,1), TEXTJOIN(", ", TRUE, SORT(UNIQUE(FILTER(cl, CODE(cl)>127)))))
3. Unless specifically stated, most Excel functions are not case-sensitive when applied to text. Neither is the UNIQUE function; therefore,
UNIQUE({"a";"A";"B";"b"}) returns {"a";"B"}
a mixed-case array. Lucie might prefer all lower-case characters for uniform appearance:
=LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)) ,1), TEXTJOIN(", ", TRUE, SORT(UNIQUE(LOWER(FILTER(cl, CODE(cl)>127))))))
4. Here is the same formula with the UNIQUE(...) part isolated for further discussion:
=LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)),1), uniq, UNIQUE(LOWER(FILTER(cl, CODE(cl)>127))), TEXTJOIN(", ", TRUE, SORT(uniq)))
5. Lucie might want to list both upper-case and lower-case characters if both appear in the names. The following article describes a case-sensitive equivalent of Excel's UNIQUE function:
https://exceljet.net/formulas/unique-values-case-sensitive
This version of the previous formula includes each unique case-sensitive "foreign-language" character:
=LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)), 1), uniq, REDUCE(, FILTER(cl, CODE(cl)>127), LAMBDA(a, v, IF(SUM(--EXACT(a, v)), a, VSTACK(a, v)))), TEXTJOIN(", ", TRUE, SORT(uniq)))
6. Like UNIQUE, the SORT function is not case-sensitive. But the SortPlus function in My Excel Toolbox has a case-sensitive option:
=LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)), 1), uniq, REDUCE(, FILTER(cl, CODE(cl)>127), LAMBDA(a, v, IF(SUM(--EXACT(a, v)), a, VSTACK(a, v)))), TEXTJOIN(", ", TRUE, SortPlus(uniq, , , , , TRUE)))
7. SortPlus also has an option for ANSI/ASCII/Unicode order instead of Excel order; that option sorts upper-case before lower-case:
=LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)), 1), uniq, REDUCE(, FILTER(cl, CODE(cl)>127), LAMBDA(a, v, IF(SUM(--EXACT(a, v)), a, VSTACK(a, v)))), TEXTJOIN(", ", TRUE, SortPlus(uniq, , , , , TRUE, TRUE)))
Each of these formulas can be copy/pasted into a worksheet for testing.
For more on SortPlus, see https://excelribbon.tips.net/T012575
and https://sites.google.com/view/MyExcelToolbox
Got a version of Excel that uses the ribbon interface (Excel 2007 or later)? This site is for you! If you use an earlier version of Excel, visit our ExcelTips site focusing on the menu interface.
FREE SERVICE: Get tips like this every week in ExcelTips, a free productivity newsletter. Enter your address and click "Subscribe."
Copyright © 2024 Sharon Parq Associates, Inc.
Comments