Counting Atoms in a Chemical Formula

Written by Allen Wyatt (last updated January 1, 2024)
This tip applies to Excel 2007, 2010, 2013, 2016, 2019, and Excel in Microsoft 365


Ruby is trying to find an easy way to determine the number of atoms in molecular formulas of some chemical structures. For instance, a cell might contain a formula such as C12H10N6F2. In this case the number of atoms is 12 + 10 + 6 + 2 = 30. Ruby has about 300 of these formulas to do and was wondering if there is an Excel formula that can be used to do this.

First, the bad news: There is no easy way to do this.

There; with that out of the way, we can start to look for solutions. The example chemical formula provided by Ruby may lead some to think that counting atoms is a simple process of substituting the alphabetic characters with something else so that just the numeric characters can be evaluated. As an example, here is Ruby's example chemical formula:

C12H10N6F2

If you replace the alphabetic characters with plus signs, you get this:

+12+10+6+2

Looks like a simple formula now, right? This is deceiving, because while it will work in this instance, it may not work at all for Ruby's other chemical formulas. Consider the following chemical formula that many people will be familiar with:

H2O

Doing the same substitution renders this:

+2+

The problem is that there is an implied count of 1 whenever an element appears without a number, as the oxygen does here. Thus, H2O actually has 3 atoms, not 2.
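To see the shortfall in action, here is a minimal sketch of the naive "just add up the numbers" idea. The function name SumWrittenCounts is made up for illustration and is not part of the original tip; it simply totals whatever numbers are written in the formula and ignores implied counts.

```vba
' Hypothetical helper (not from the tip): totals only the numbers that
' are explicitly written in the formula, ignoring implied counts of 1.
Function SumWrittenCounts(ChemForm As String) As Long
    Dim sNum As String
    Dim sChar As String
    Dim lTotal As Long
    Dim J As Long

    For J = 1 To Len(ChemForm)
        sChar = Mid(ChemForm, J, 1)
        If sChar >= "0" And sChar <= "9" Then
            sNum = sNum & sChar              ' extend the current number
        Else
            lTotal = lTotal + Val(sNum)      ' any other character ends it
            sNum = ""
        End If
    Next J

    ' Add a trailing number, if there is one (Val("") is 0)
    SumWrittenCounts = lTotal + Val(sNum)
End Function
```

For C12H10N6F2 this returns 30, which happens to be right, but for H2O it returns 2 because the lone oxygen is never counted.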

So now we can come up with a way to simply account for the implied 1, right? Sure; this can be done. It can be done most easily and cleanly with a macro, such as the following user-defined function:

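Function CountAtoms(ChemForm As String) As Integer
    ' Counts the atoms in a simple formula such as C12H10N6F2 or H2O.
    ' Assumes the formula starts with an element symbol and has no
    ' leading multiplier or parentheses.
    Dim sNewNum As String
    Dim sTemp As String
    Dim iNewAtoms As Integer
    Dim iTotalAtoms As Integer
    Dim J As Integer

    ' An empty string contains no atoms
    If Len(ChemForm) = 0 Then
        CountAtoms = 0
        Exit Function
    End If

    sNewNum = ""
    iTotalAtoms = 0

    ' Start at 2: the first character begins the first element, which
    ' is counted when the next element (or the end) is reached
    For J = 2 To Len(ChemForm)
        sTemp = Mid(ChemForm, J, 1)
        If sTemp >= "0" And sTemp <= "9" Then
            ' Build up the (possibly multi-digit) count
            sNewNum = sNewNum & sTemp
        ElseIf sTemp >= "A" And sTemp <= "Z" Then
            ' An uppercase letter starts a new element, so close out
            ' the previous one; no digits means an implied count of 1
            iNewAtoms = Val(sNewNum)
            If iNewAtoms = 0 Then iNewAtoms = 1
            iTotalAtoms = iTotalAtoms + iNewAtoms
            sNewNum = ""
        End If
    Next J

    ' Close out the final element in the formula
    iNewAtoms = Val(sNewNum)
    If iNewAtoms = 0 Then iNewAtoms = 1
    iTotalAtoms = iTotalAtoms + iNewAtoms

    CountAtoms = iTotalAtoms
End Function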

In order to use this function in your worksheet, you would simply reference the chemical formula:

=CountAtoms(A1)

If the chemical formula is in cell A1, this function returns the count you desire. It will even work with formulas such as the following:


Note that these rely on two-character element names, of which there are many. It does require, however, that the second character of a two-character element name not be capitalized.

So, will this approach work with all chemical formulas? Not really; it only works with the simple ones we've covered so far. You see, chemical formulas can get quite complex. Consider the following example:

2H2O

When an initial number appears like this, then the formula is to be multiplied by that value. Thus, instead of the normal 3 atoms in H2O, this formula would have 6 atoms.
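A leading multiplier, as in a formula such as 2H2O, can be handled with a small extension of the same idea. The following is only a sketch under stated assumptions: the name CountAtomsCoeff is invented for illustration, the module is assumed to use VBA's default binary string comparison, and parentheses are not handled at all.

```vba
' Hypothetical sketch (not from the tip): supports an optional leading
' multiplier such as the 2 in 2H2O. Parentheses are NOT handled.
' Assumes the default Option Compare Binary, so "a" does not fall in A-Z.
Function CountAtomsCoeff(ChemForm As String) As Long
    Dim sChar As String
    Dim sNum As String
    Dim lCoeff As Long
    Dim lCount As Long
    Dim lTotal As Long
    Dim lStart As Long
    Dim bInElement As Boolean
    Dim K As Long

    ' Read any digits that come before the first element symbol
    lStart = 1
    Do While lStart <= Len(ChemForm)
        sChar = Mid(ChemForm, lStart, 1)
        If sChar < "0" Or sChar > "9" Then Exit Do
        sNum = sNum & sChar
        lStart = lStart + 1
    Loop
    lCoeff = Val(sNum)
    If lCoeff = 0 Then lCoeff = 1            ' no multiplier means 1
    sNum = ""

    ' Count the atoms in the rest of the formula
    For K = lStart To Len(ChemForm)
        sChar = Mid(ChemForm, K, 1)
        If sChar >= "0" And sChar <= "9" Then
            sNum = sNum & sChar
        ElseIf sChar >= "A" And sChar <= "Z" Then
            ' A capital letter starts a new element; close the previous one
            If bInElement Then
                lCount = Val(sNum)
                If lCount = 0 Then lCount = 1
                lTotal = lTotal + lCount
            End If
            bInElement = True
            sNum = ""
        End If
    Next K

    ' Close out the final element, if any element was seen
    If bInElement Then
        lCount = Val(sNum)
        If lCount = 0 Then lCount = 1
        lTotal = lTotal + lCount
    End If

    CountAtomsCoeff = lCoeff * lTotal
End Function
```

With this sketch, =CountAtomsCoeff("2H2O") returns 6, while =CountAtomsCoeff("H2O") still returns 3.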

It gets worse. Consider the following valid chemical formulas:


Note the parentheses followed by a number. In this nomenclature, the value immediately following the closing parenthesis indicates how many of the molecules within the parentheses are in the larger molecule. Thus, in the second example there are 3 molecules of SO4 and 18 molecules of H2O in the overall molecule. (The second example is Al2(SO4)3(H2O)18, aluminum sulphate octadecahydrate.) This obviously affects the number of atoms in the entire formula. To compound the complexity, parentheses can even be nested:


Fun, huh?
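If you want a feel for what handling parentheses involves, here is a rough sketch that keeps a running total for each open parenthesis level. Both function names (CountAtomsNested and ReadCount) are invented for illustration. The sketch assumes balanced parentheses, properly capitalized element symbols, and the default binary string comparison; it does not handle leading multipliers, hydrate dots, or ion charges.

```vba
' Hypothetical sketch (not from the tip): counts atoms in formulas with
' (possibly nested) parentheses, such as Al2(SO4)3(H2O)18.
' Assumes balanced parentheses and the default Option Compare Binary.
Function CountAtomsNested(ChemForm As String) As Long
    Dim lStack() As Long      ' running total for each parenthesis level
    Dim lDepth As Long        ' current nesting level (0 = outermost)
    Dim lPos As Long
    Dim lGroup As Long
    Dim sChar As String

    ReDim lStack(0 To Len(ChemForm))
    lPos = 1
    Do While lPos <= Len(ChemForm)
        sChar = Mid(ChemForm, lPos, 1)
        lPos = lPos + 1
        If sChar = "(" Then
            ' Start a fresh total for the new group
            lDepth = lDepth + 1
            lStack(lDepth) = 0
        ElseIf sChar = ")" Then
            If lDepth = 0 Then Exit Do       ' unbalanced; stop parsing
            ' Multiply the group total by the number after the ")"
            lGroup = lStack(lDepth)
            lDepth = lDepth - 1
            lStack(lDepth) = lStack(lDepth) + lGroup * ReadCount(ChemForm, lPos)
        ElseIf sChar >= "A" And sChar <= "Z" Then
            ' Skip the lowercase remainder of the element symbol
            Do While lPos <= Len(ChemForm)
                sChar = Mid(ChemForm, lPos, 1)
                If sChar < "a" Or sChar > "z" Then Exit Do
                lPos = lPos + 1
            Loop
            lStack(lDepth) = lStack(lDepth) + ReadCount(ChemForm, lPos)
        End If
    Loop

    CountAtomsNested = lStack(0)
End Function

' Reads the digits starting at lPos (advancing lPos past them) and
' returns their value, or 1 when no digits are present.
Private Function ReadCount(s As String, ByRef lPos As Long) As Long
    Dim sNum As String
    Dim sChar As String

    Do While lPos <= Len(s)
        sChar = Mid(s, lPos, 1)
        If sChar < "0" Or sChar > "9" Then Exit Do
        sNum = sNum & sChar
        lPos = lPos + 1
    Loop
    If Len(sNum) = 0 Then ReadCount = 1 Else ReadCount = Val(sNum)
End Function
```

For Al2(SO4)3(H2O)18 the sketch works out 2 + (5 x 3) + (3 x 18) = 71 atoms. Treat it as a starting point only; real-world formulas have more notation than this covers.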

This can still be addressed with a more complex macro. Rather than reinvent the wheel here, though, if you are working with complex chemical formulas such as these, you might want to consider using the macros provided at this site:

Note that the macros aren't implemented as user-defined functions. To use them you simply select the cells with the formulas, run the macro, and then the macros modify information in the columns to the right of the selected chemical formulas. Full instructions are included with the code at the above website.

You'll also need to enable regular expressions in the Visual Basic Editor. You do this by choosing Tools | References and then scrolling through the available references to locate the Microsoft VBScript Regular Expressions 5.5 option. Make sure the check box to the left of the reference is selected, then click OK.
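For a sense of what regular expressions bring to this problem, here is a minimal sketch that counts atoms in simple formulas (no parentheses or leading multipliers). It is not the code from the site mentioned above, and the name CountAtomsRegex is invented for illustration. It creates the VBScript.RegExp object with CreateObject (late binding), which in this one function stands in for setting the reference; it assumes Excel on Windows, where that library is available.

```vba
' Hypothetical sketch (not the code from the referenced site): uses a
' regular expression to count atoms in simple formulas.
' Late binding via CreateObject; assumes Windows.
Function CountAtomsRegex(ChemForm As String) As Long
    Dim oRegex As Object
    Dim oMatch As Object
    Dim lTotal As Long

    Set oRegex = CreateObject("VBScript.RegExp")
    oRegex.Global = True                      ' find every element, not just the first
    oRegex.Pattern = "([A-Z][a-z]?)(\d*)"     ' element symbol, then optional count

    For Each oMatch In oRegex.Execute(ChemForm)
        If Len(oMatch.SubMatches(1)) = 0 Then
            lTotal = lTotal + 1               ' implied count of 1
        Else
            lTotal = lTotal + CLng(oMatch.SubMatches(1))
        End If
    Next oMatch

    CountAtomsRegex = lTotal
End Function
```

The pattern treats an uppercase letter plus an optional lowercase letter as one element, so two-character symbols such as Na and Cl are counted once each.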



ExcelTips is your source for cost-effective Microsoft Excel training. This tip (13707) applies to Microsoft Excel 2007, 2010, 2013, 2016, 2019, and Excel in Microsoft 365.

Author Bio

Allen Wyatt

With more than 50 non-fiction books and numerous magazine articles to his credit, Allen Wyatt is an internationally recognized author. He is president of Sharon Parq Associates, a computer and publishing services company. ...



2024-01-03 01:52:26


Tangentially to this issue I have a tip for calculating molar mass of molecules.
I have created an Excel template that has defined names for essentially all common chemical elements, but instead of assigning ranges to those names I assigned explicit values of atomic weights, with 2 or 3 decimals. This way they do not clutter any worksheet. Alternatively, you could use a hidden worksheet for that.

I could assign range names using element symbols for all elements except for C (why C is not accepted as a valid range name is a mystery to me). For this I used "C_".
Now I can easily calculate molecular weights of compounds without having to memorize or look up atomic masses, e.g.,
for CuSO4·5H2O I write =Cu+S+O*4+5*(H*2+O)
For C6H6O12 -----> =C_*6+H*6+O*12
The range names are not case sensitive so when typing you do not have to pay attention to it; I used proper capitalization for the range names though, so once entered Excel will convert to proper element symbols in the formulas.
I could find that template (I am retired now, so I do not use it often) and send it to anyone interested.

2024-01-03 01:16:31

Tomek the Mad Scientist

Chemical formulas were first invented at the beginning of the 19th century, and although they changed a bit from the original proposed by Berzelius, once they were agreed upon they remained mostly unchanged. This implies that they are meant for human interpretation and are not easily entered into present versions of Office programs. Chemistry, a descendant of alchemy, was always arcane knowledge, so possibly there was no desire to make it more understandable to the common person, and formulas stayed somewhat cryptic. In particular, the formulas use subscripts to indicate the count of a preceding atom or group, brackets to collect elements into groups, and superscripted numbers with a superscripted +/- to indicate ion charge. Unscripted numbers before a molecule or its part indicate that whatever group or molecule follows is to be taken as a multiple.

Although you can format a chemical formula in an Excel cell to display both subscripts and superscripts, it is very tedious and most people just don't bother. Such formatting could possibly allow for better parsing by some macro, but creating such a program would still be a full-blown project. The link given by Allen has a function that converts all digits in the formula to subscripts; however, it works properly only on some formulas. The formula in the tip for aluminum sulphate octadecahydrate, Al2(SO4)3(H2O)18, should actually be written as Al2(SO4)3·18H2O (see Figure 1 below); the function converts all of its digits to subscripts, but the 18 should stay normal size. The formulas that count atoms and provide simplified formulas also do not handle formulas similar to that last one.
I think it is high time for IUPAC or a similar body to consider a significant revision of the way chemical formulas are written, to make it compatible with computer programs, or do we want to wait for AI to make even more of a mess of this?

Figure 1. 

2024-01-03 01:15:04

Tomek the Mad Scientist

@Leslie Glasser:

Whether SO4 is a group or a molecule is irrelevant to the problem discussed here. One way or the other some formulas will not represent an actual molecule. There are some organometallic compounds that exist only as dimers, but still the formulas for them are rarely written to reflect this.

2019-11-30 17:41:42

Leslie Glasser

Please note: The statement "Thus, in the second example there are 3 molecules of SO4 and 18 molecules of H2O in the overall molecule." is chemically incorrect. SO4 is not a molecule, rather call it "a group of atoms". Similarly, the overall formula does not represent a molecule but rather an "empirical formula".

2019-11-30 14:54:12


Numerical items in a molecule formula sometimes don’t relate to the number of atoms but to the relative position of the bond in the molecule ... simply counting them (even taking into account all the considerations mentioned above) still won’t give a robust solution ...
