Extracting Foreign-Language Characters

Written by Allen Wyatt (last updated June 1, 2024)

4

Lucie has imported into a workbook data from her company's HR/payroll system that she needs to forward to a third-party vendor. Before she forwards it, however, she needs to compile a list of foreign-language characters contained in the employee names, such as umlauts and accents. She doesn't need to replace the characters; she just needs a list of them.

There are a couple of ways to do this. First, if you are using Excel 2021 or the version with Microsoft 365, you can put together a powerful formula that will pull the desired characters:

=LET(wl,A:A,cwl,TEXTJOIN(,TRUE,wl),cl,MID(cwl,SEQUENCE(LEN(cwl)),1),
SORT(UNIQUE(FILTER(cl,CODE(cl)>127))))

Even though the formula is shown here in two lines, it is a single formula. It assumes that all of the employee names are in column A. You can place this formula in a cell in a different column, and as long as there is nothing in the cells below the formula, you'll have a sorted list of characters returned.

The formula works by concatenating all the names in column A into a single string (assigned to the variable cwl) and then examining each character in the string. If the character has a code value of 128 or greater, then it is considered a foreign character. (Those with codes below 128 are assumed to be non-foreign characters.)

The formula returns only unique characters, and those are sorted. Because the formula concatenates all the names (using the TEXTJOIN function), there is a limit on what it can process. If the combined length of all the characters is 32,767 or more, then a #CALC error is returned.

If you are not using the latest versions of Excel or you might have more than 32,767 characters you are examining, then you should consider using a macro. Here's an example of one that will go through all of the cells in a range and return the foreign-language characters:

Function ForeignChars(ByVal MyRange As Range) As String
    'All characters from 0 to 127 are considered non-foreign
    'All characters above 127 are considered foreign

    Dim c As Range
    Dim sTemp As String
    Dim sChars As String
    Dim J As Integer
    Dim K As Integer
    Dim bFound As Boolean

    Application.Volatile
    sChars = ""
    For Each c In MyRange
        sTemp = c.Text
        For J = 1 To Len(sTemp)
            If AscW(Mid(sTemp, J, 1)) > 127 Then
                bFound = False
                For K = 1 To Len(sChars)
                    If Mid(sChars, K, 1) = Mid(sTemp, J, 1) Then bFound = True
                Next K
                If Not bFound Then
                    sChars = sChars & Mid(sTemp, J, 1)
                End If
            End If
        Next J
    Next c
    ForeignChars = sChars
End Function

In order to use the function, you could use this in your worksheet:

=ForeignChars(A1:A2500)

This checks the contents of the range designated (A1:A2500). The foreign-language characters are returned as a single string by the function. You'll also find that the macro returns a wider range of foreign characters than the earlier formula because the CODE worksheet function (used in the formula) evaluates text a bit differently than the AscW VBA function (used in the macro).

Note:

If you would like to know how to use the macros described on this page (or on any other page on the ExcelTips sites), I've prepared a special page that includes helpful information. Click here to open that special page in a new browser tab.

ExcelTips is your source for cost-effective Microsoft Excel training. This tip (767) applies to Microsoft Excel 2007, 2010, 2013, 2016, 2019, 2021, and Excel in Microsoft 365.

Author Bio

Allen Wyatt

With more than 50 non-fiction books and numerous magazine articles to his credit, Allen Wyatt is an internationally recognized author. He is president of Sharon Parq Associates, a computer and publishing services company. ...

MORE FROM ALLEN

Enlarging Screen Font Size

Sometimes things appearing on the screen are a bit too small to read easily. One possible solution is to adjust the size ...

Discover More

Margin Notes in Word

Some types of documents rely upon margin notes to the left or right of your main text. Getting these to appear in Word ...

Discover More

Understanding Functions in Macros

Functions are a common programming construct. They help you to create easy ways of processing information and returning a ...

Discover More

Professional Development Guidance! Four world-class developers offer start-to-finish guidance for building powerful, robust, and secure applications with Excel. The authors show how to consistently make the right design decisions and make the most of Excel's powerful features. Check out Professional Excel Development today!

More ExcelTips (ribbon)

Generating a Unique ID Number

If you are keeping track of people or things within Excel, you may want to devise unique ID numbers you can use for those ...

Discover More

Limiting a Calculated Value to a Range

If you want to limit what is returned by a formula to something between lower and upper boundaries, the solution is to ...

Discover More

Deriving Antilogs

Creating math formulas is a particular strong point of Excel. Not all the functions that you may need are built directly ...

Discover More
Subscribe

FREE SERVICE: Get tips like this every week in ExcelTips, a free productivity newsletter. Enter your address and click "Subscribe."

View most recent newsletter.

Comments

If you would like to add an image to your comment (not an avatar, but an image to help in making the point of your comment), include the characters [{fig}] (all 7 characters, in the sequence shown) in your comment text. You’ll be prompted to upload your image when you submit the comment. Maximum image size is 6Mpixels. Images larger than 600px wide or 1000px tall will be reduced. Up to three images may be included in a comment. All images are subject to review. Commenting privileges may be curtailed if inappropriate images are posted.

What is four more than 7?

2025-11-09 01:53:30

sandeep

Wonderful!


2024-06-06 15:30:56

J. Woolley

As noted in my first comment below, Excel 2021's UNIQUE function is not case-sensitive. An equivalent using REDUCE, LAMBDA, and VSTACK is case-sensitive, but those functions currently require Excel 365.nMy Excel Toolbox now includes the following function to return unique rows or columns from a cell range (contiguous) or array constant or result of an array function:n    =UniquePlus(RangeArray, [LeftRight], [ExactlyOnce], [HasHeader], n        [CaseSensitive])nThe first three parameters match Excel's UNIQUE function.nWhen HasHeader is TRUE, the first row (LeftRight=FALSE) or first column (LeftRight=TRUE) is returned even if it is not unique.nCase-sensitive comparisons are done when CaseSensitive is TRUE.nThis function does not require Excel 2021 or newer.nSee https://sites.google.com/view/MyExcelToolbox


2024-06-02 16:48:50

J. Woolley

The Tip's ForeignChars function returns an unsorted string of characters; for examplen    ëÄäéÉËnThe result will be case-sensitive as illustrated in that example if the function is in a module with the defaultn    Option Compare BinarynBut if the module begins with n    Option Compare Textnthe result will not be case-sensitive and the previous example would simply returnn    ëÄénThe following version returns a sorted case-sensitive comma-separated result for that example like thisn    Ä, É, Ë, ä, é, ënregardless of Option Compare:nnFunction ForeignChars2(ByVal Target As Range) As Stringn'ASCII is 0 to 127; non-ASCII is "foreign"n    Dim Chars As New Collection, char As Stringn    Dim cell As Range, n As Integer, k As Integern    Const COMP As Integer = vbBinaryCompare 'case-sensitiven    If Target.Cells.Count > 1 Thenn        Set Target = Application.Intersect(Target.Parent.UsedRange, Target)n    End Ifn    For Each cell In Targetn        If WorksheetFunction.IsText(cell) Thenn            For n = 1 To Len(cell)n                char = Mid(cell, n, 1)n                If AscW(char) > 127 Then 'non-ASCIIn                    If Chars.Count = 0 Thenn                        Chars.Add char 'add firstn                    ElseIf StrComp(char, Chars(Chars.Count), COMP) > 0 Thenn                        Chars.Add char 'add char > Chars(Chars.Count)n                    Elsen                        For k = 1 To Chars.Countn                            Select Case StrComp(char, Chars(k), COMP)n                            Case 0 'already added char = Chars(k)n                                Exit Forn                            Case Is < 0 'add char < Chars(k)n                                Chars.Add char, , k 'before Chars(k)n                                Exit Forn                            End Selectn                        Next kn                    End Ifn                End Ifn            Next nn        End Ifn    Next celln    If Chars.Count = 0 Then ForeignChars2 = "none": Exit Functionn    ForeignChars2 = Chars(1)n    For k = 2 To Chars.Countn        ForeignChars2 = ForeignChars2 & ", " & Chars(k)n    Next knEnd Function


2024-06-01 18:47:19

J. Woolley

The Tip's first formula is very clever:n    =LET(wl, A:A, cwl, TEXTJOIN(, TRUE, wl), cl, MID(cwl, SEQUENCE(LEN(cwl)) ,1), SORT(UNIQUE(FILTER( cl, CODE(cl)>127))))nHere are some nit-picking comments:n1. TEXTJOIN(, TRUE, wl) could be replaced by CONCAT(wl) or CONCAT(A:A) for the same result. Using the latter means the wl parameter is unnecessary:n    =LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)), 1), SORT(UNIQUE(FILTER( cl, CODE(cl)>127))))n2. The Tip's formula returns a vertical array with N rows and 1 column, where N is the number of unique "foreign-language" (non-ASCII) characters. Lucie might prefer a comma separated list in a single cell:n    =LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)) ,1), TEXTJOIN(", ", TRUE, SORT(UNIQUE(FILTER(cl, CODE(cl)>127)))))n3. Unless specifically stated, most Excel functions are not case-sensitive when applied to text. Neither is the UNIQUE function; therefore, n    UNIQUE({"a";"A";"B";"b"}) returns {"a";"B"}na mixed-case array. Lucie might prefer all lower-case characters for uniform appearance:n    =LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)) ,1), TEXTJOIN(", ", TRUE, SORT(UNIQUE(LOWER(FILTER(cl, CODE(cl)>127))))))n4. Here is the same formula with the UNIQUE(...) part isolated for further discussion:n    =LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)),1), uniq, UNIQUE(LOWER(FILTER(cl, CODE(cl)>127))), TEXTJOIN(", ", TRUE, SORT(uniq)))n5. Lucie might want to list both upper-case and lower-case characters if both appear in the names. The following article describes a case-sensitive equivalent of Excel's UNIQUE function:nhttps://exceljet.net/formulas/unique-values-case-sensitivenThis version of the previous formula includes each unique case-sensitive "foreign-language" character:n    =LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)), 1), uniq, REDUCE(, FILTER(cl, CODE(cl)>127), LAMBDA(a, v, IF(SUM(--EXACT(a, v)), a, VSTACK(a, v)))), TEXTJOIN(", ", TRUE, SORT(uniq)))n6. Like UNIQUE, the SORT function is not case-sensitive. But the SortPlus function in My Excel Toolbox has a case-sensitive option:n    =LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)), 1), uniq, REDUCE(, FILTER(cl, CODE(cl)>127), LAMBDA(a, v, IF(SUM(--EXACT(a, v)), a, VSTACK(a, v)))), TEXTJOIN(", ", TRUE, SortPlus(uniq, , , , , TRUE)))n7. SortPlus also has an option for ANSI/ASCII/Unicode order instead of Excel order; that option sorts upper-case before lower-case:n    =LET(cwl, CONCAT(A:A), cl, MID(cwl, SEQUENCE(LEN(cwl)), 1), uniq, REDUCE(, FILTER(cl, CODE(cl)>127), LAMBDA(a, v, IF(SUM(--EXACT(a, v)), a, VSTACK(a, v)))), TEXTJOIN(", ", TRUE, SortPlus(uniq, , , , , TRUE, TRUE)))nEach of these formulas can be copy/pasted into a worksheet for testing.nFor more on SortPlus, see https://excelribbon.tips.net/T012575nand https://sites.google.com/view/MyExcelToolbox


This Site

Got a version of Excel that uses the ribbon interface (Excel 2007 or later)? This site is for you! If you use an earlier version of Excel, visit our ExcelTips site focusing on the menu interface.

Newest Tips
Subscribe

FREE SERVICE: Get tips like this every week in ExcelTips, a free productivity newsletter. Enter your address and click "Subscribe."

(Your e-mail address is not shared with anyone, ever.)

View the most recent newsletter.