Pulling Apart a URL

by Allen Wyatt
(last updated November 22, 2014)

7

Jordan has a need to break apart a URL into its components. If the full URL (http://www.xyz.com/business) is in column A, he would like a way to have the main domain (xyz or xyz.com) in column B and the path (business) in column C.

There are multiple ways you can go about solving this issue. If you only need to do this a time or two, it might be easiest to use the Text to Columns tool, available on the Data tab of the ribbon. When you specify how you want the tool to pull the strings apart, indicate that you want it to use the slash (/) as a delimiter. Depending on the way your original URLs are put together, this may require a bit of "cleanup" after the tool is done, but it is quick and easy.

If you do this more often, then you might benefit from using a formula to do the manipulation of the URL. The first task is to pull the domain from the URL, but this is a bit tricker than it sounds. For instance, the URL may start with any number of protocol specifiers (such as http://, https://, ftp://, etc.) or it may not begin with a protocol specifier at all. Plus, you may want to strip off the "www." nomenclature, as some people consider it extraneous. To figure out where the "real" domain starts, you could use this formula in column B:

=IFERROR(FIND("www.",$A1)+4,IFERROR(FIND("://",$A1,1)+3,1))

What it does it to return the position of the first character after "www.", if it is available. If it is not present, then it returns the position of the first character after "://", if it is present. If that is not there, then the number 1 is returned, since the URL doesn't contain any protocol prefix or www.

With that value in column B, you can then start to actually pull out the parts you want. The following, placed, in column C, will return the domain, as already described. It essentially returns everything from the character position shown in column B up to the next slash (/).

=MID(A1,B1,IFERROR(FIND("/",A1,B1)-B1,LEN(A1))

If the original url included "www.xyz.com," then "xyz.com" is returned by the formula. If it contained a different subdomain other than "www" (such as "research.xyz.com"), then the full domain including the subdomain is returned. The IFERROR function is included in case there is no slash (/) after the domain name. (The FIND function returns a #VALUE error if it cannot find the item for which it is looking.)

To pick off the path after the domain, you can use the following formula in column D:

=IFERROR(MID(A1,FIND("/",A1,B1)+1,LEN(A1)),"")

The formula returns everything after the first slash (/) occurring after the domain name. If there is no slash after the domain name, then it returns nothing. (This is, again, thanks to the use of the IFERROR function.)

The key to using the above formulas, again, is the helper column in column B. If you wanted to do away with the need for column B, you would need to replace in the other two formulas all instances of B1 with the formula (without the equal sign) that belongs in B1. That would, of course, make the other two formulas quite long, especiall the one noted for column C.

You can, if desired, create a macro that essentially does the same thing, without the need for a helper column. The following example expect you to make a selection and then pulls the parts of the URL out of the cells and places them into the two columns to the right of the selection.

Sub GetURLParts()
    Dim c As Range
    Dim sRaw As String
    Dim J As Integer
    
    For Each c In Selection
        sRaw = c.Text
        J = InStr(sRaw, "://")
        If J > 0 Then sRaw = Mid(sRaw, J + 3)
        If LCase(Left(sRaw, 4)) = "www." Then
            sRaw = Mid(sRaw, 5)
        End If
        J = InStr(sRaw, "/")
        If J > 0 Then
            c.Offset(0, 1) = Left(sRaw, J - 1)
            c.Offset(0, 2) = Mid(sRaw, J + 1)
        Else
            c.Offset(0, 1) = sRaw
            c.Offset(0, 2) = ""
        End If
    Next c
End Sub

ExcelTips is your source for cost-effective Microsoft Excel training. This tip (13320) applies to Microsoft Excel 2007, 2010, and 2013.

Author Bio

Allen Wyatt

With more than 50 non-fiction books and numerous magazine articles to his credit, Allen Wyatt is an internationally recognized author. He  is president of Sharon Parq Associates, a computer and publishing services company. ...

MORE FROM ALLEN

Filling References to Another Workbook

When you create references to cells in other workbooks, Excel, by default, makes the references absolute. This makes it ...

Discover More

Offering Options in a Macro

When creating macros, you often need to offer a series of choices to a user. This tip demonstrates how easy it is to offer ...

Discover More

Copying Headers and Footers

Need to copy headers and footers from one worksheet to another? How about from one workbook to another? Here are some ...

Discover More

Create Custom Apps with VBA! Discover how to extend the capabilities of Office 2013 (Word, Excel, PowerPoint, Outlook, and Access) with VBA programming, using it for writing macros, automating Office applications, and creating custom applications. Check out Mastering VBA for Office 2013 today!

More ExcelTips (ribbon)

Dynamic Hyperlinks in Excel

Hyperlinks to many types of Web sites rely on passing parameters in the URL. Knowing this, you can construct a dynamic ...

Discover More

Special Characters In Hyperlinks

Do you use special characters (such as the pound sign) in your worksheet names? If so, you could run into problems creating ...

Discover More

Using Drag-and-Drop to Create a Hyperlink

If you open workbooks in two instances of Excel, you can use drag-and-drop techniques to create hyperlinks from one workbook ...

Discover More
Subscribe

FREE SERVICE: Get tips like this every week in ExcelTips, a free productivity newsletter. Enter your address and click "Subscribe."

View most recent newsletter.

Comments

If you would like to add an image to your comment (not an avatar, but an image to help in making the point of your comment), include the characters [{fig}] in your comment text. You’ll be prompted to upload your image when you submit the comment. Maximum image size is 6Mpixels. Images larger than 600px wide or 1000px tall will be reduced. Up to three images may be included in a comment. All images are subject to review. Commenting privileges may be curtailed if inappropriate images are posted.

What is one more than 9?

2014-11-25 10:50:37

Jordan

A huge thank-you to Allen and everyone that contributed to this answer. Wishing you and all of your families a very happy Thanksgiving.


2014-11-25 08:02:06

Jim

four things, find in the URL the position of:
1. "//"
2. ".com"
3. next "/" after ".com" (if there's no "/", use #2)
4. LEN of url

now lets play with these stuffs using MID formula. between "//" and ".com" was the parent domain.


2014-11-24 09:55:20

Michael (Micky) Avidan

@Allen,
Although I'm not hasty - I usually refer to the examples you present in your tip.
Do me and yourself a favour and present such a list (of which, I presume, you checked your suggested formulas) right from the beginning(!)
Michael (Micky) Avidan
“Microsoft® Answers" - Wiki author & Forums Moderator
“Microsoft®” MVP – Excel (2009-2015)
ISRAEL


2014-11-23 07:11:54

barouh

There is one more often task for URL analysis - and I don;t know easy solution for it. Will be glad to hear some recipes for it

I'm talking about the task to convert full domain name to 2nd level domain (i.e. all cases sub.domain.com, www.domain.com, domain.com/smth should be converted to domain.com)
It's pretty easy if the dataset includes only domains of 2nd and 3rd level - we can define domain level by formula LEN(A1)-LEN(substitute(A1,".";"")+1, and then for 3rd level domain we extract the text after first "." by combination of MID and FIND
But what if the dataset includes domains of 4th, 5th, 6th level and more? (in real data I met even 9th and 10th level domains)


2014-11-22 16:52:06

Dawit

The best tool for this task is- Regular expression. If you goggle it, there are hundreds of free codes.


2014-11-22 09:48:11

awyatt

Micky: There are many instances where your approach won't work. Here are examples:

http://xyz.com
http://xyz.com/
http://xyz.com/business
http://xyz.com/business/
http://xyz.com/business/more
http://xyz.com/business/more/
http://www.xyz.com/business/
http://www.xyz.com/business/more
http://www.xyz.com/business/more/
http://business.xyz.com
http://business.xyz.com/
http://business.xyz.com/newstuff
http://business.xyz.com/newstuff/
http://business.xyz.com/newstuff/more
http://business.xyz.com/newstuff/more/

The approaches I published worked in all of these. Yours returns either #VALUE! errors in column B or the wrong values in columns B and C.

-Allen


2014-11-22 07:46:06

Michael (Micky) Avidan

To my opinion - too long and too complicated.
From what I have just tested - it can be shorten a lot.
* Starting with cell C1 - type:
=TRIM(RIGHT(SUBSTITUTE(A1,"/",REPT(" ",256)),256))
* In cell B1 type:
=RIGHT(SUBSTITUTE(A1,"/"&C1,),LEN(SUBSTITUTE(A1,"/"&C1,))-FIND(".",SUBSTITUTE(A1,"/"&C1,)))
Et Voilà!
Michael (Micky) Avidan
“Microsoft® Answers" - Wiki author & Forums Moderator
“Microsoft®” MVP – Excel (2009-2015)
ISRAEL


This Site

Got a version of Excel that uses the ribbon interface (Excel 2007 or later)? This site is for you! If you use an earlier version of Excel, visit our ExcelTips site focusing on the menu interface.

Newest Tips
Subscribe

FREE SERVICE: Get tips like this every week in ExcelTips, a free productivity newsletter. Enter your address and click "Subscribe."

(Your e-mail address is not shared with anyone, ever.)

View the most recent newsletter.