Hey gang! So here’s what I’ve been cooking up… I was trying to figure out a way to generate a report of all (almost all) links within a SharePoint web application so that I could analyze them for potential broken links come a migration project. Details and the PowerShell script are below. Or you can download the script here.
Scenario:
In prep for a SharePoint 2007 to SharePoint 2010 migration project, I wanted to do a prescan of all the links in the:
– Quick Launch
– Top/Global Navigation
– Links Lists
Why? Well, we have an environment with a bajillion collaborative Site Collections, and we all know it’s rare for an end-user to use relative links when manually creating links to things within the SharePoint sites. More often than not, absolute links get used, which can be quite a quandary in a migration where URLs can change for any number of reasons. With this report, I could scan through the links very quickly and notify site owners about any that may need to be updated in the future.
Solution Synopsis:
Loop through every SharePoint Site and Site Collection in a web application, every link in the Quick Launch menus of each site, every Top/Global Nav link of each site, and every Links list of each site. Take all that and dump into a spreadsheet. Visually scan spreadsheet for suspect links. This script will work with both publishing and non-publishing sites. *Have not tested on SharePoint 2010 environments.
*Modified script 5/23/2011 to also report all links starting with “http://” contained within the body of the default page for each SPWeb object. This is useful in case end-users manually put links into Content Editor Web Parts or Page Content Controls that link back to site content using absolute links. Thanks to the Hey, Scripting Guy! Blog for the inspiration.
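To get a feel for what that body scan picks up, here is a minimal sketch that runs the same regular expression the script uses against a made-up chunk of HTML. The sample markup and the $sampleHtml variable are hypothetical stand-ins for what the WebClient download would return from a real default page.

```powershell
# Hypothetical sample HTML standing in for a downloaded default page.
$sampleHtml = '<div><a href="http://intranet/sites/hr/Pages/default.aspx">HR</a>' +
              '<a href="/sites/it/default.aspx">IT</a>' +
              '<a href="ftp://files.example.com/docs">Files</a></div>'

# Same pattern the script uses: matches URLs starting with http:// or ftp://.
$pattern = "(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)"

# -AllMatches returns every match in the string, not just the first one.
$result = $sampleHtml | Select-String -Pattern $pattern -AllMatches
$links = @($result.Matches | ForEach-Object { $_.Groups[1].Value })

$links
```

The relative link in the middle is skipped, which is the point: only absolute links get reported. Note that the pattern as written only matches http:// and ftp://, so https:// links would not show up without adjusting it.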
Sample Output:
Click here for a sample output CSV.
PowerShell Script (feel free to leave constructive feedback if you see something that could be done differently/better!):
# This script will print out a list of all links in the Quick Launch, Links Lists, and default
# page for each SPWeb object to a text file located in the directory specified in the variables section.
#
# Author: Henry Ong

######################## Start Variables ########################
$siteURL = "http://alvhong2" #URL to any site in the web application.
$filePath = "C:\PowerShellScripts\Links.csv"
$PublishingFeatureGUID = "94c94ca6-b32f-4da9-a9e3-1f3d343d7ecb"
######################## End Variables ########################

if(Test-Path $filePath)
{
    Remove-Item $filePath
}
Clear-Host

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint.Publishing")
[System.Reflection.Assembly]::LoadWithPartialName("System.Net.WebClient")

# Creates an object that represents an SPWeb's Title and URL
function CreateNewWebObject
{
    $linkObject = New-Object system.Object
    $linkObject | Add-Member -type NoteProperty -Name WebTitle -Value $web.Title
    $linkObject | Add-Member -type NoteProperty -Name WebURL -Value $web.URL
    return $linkObject
}

# Creates an object that represents the header links of the Quick Launch
function CreateNewLinkHeaderObject
{
    $linkObject = New-Object system.Object
    $linkObject | Add-Member -type NoteProperty -Name WebTitle -Value $prevWebTitle
    $linkObject | Add-Member -type NoteProperty -Name WebURL -Value $prevWebURL
    $linkObject | Add-Member -type NoteProperty -Name QLHeaderTitle -Value $node.Title
    $linkObject | Add-Member -type NoteProperty -Name QLHeaderLink -Value $node.Url
    return $linkObject
}

# Creates an object that represents to the links in the Top Link bar
function CreateNewTopLinkObject
{
    $linkObject = New-Object system.Object
    $linkObject | Add-Member -type NoteProperty -Name WebTitle -Value $prevWebTitle
    $linkObject | Add-Member -type NoteProperty -Name WebURL -Value $prevWebURL
    $linkObject | Add-Member -type NoteProperty -Name TopLinkTitle -Value $node.Title
    $linkObject | Add-Member -type NoteProperty -Name TopLinkURL -Value $node.Url
    $linkObject | Add-Member -type NoteProperty -Name TopNavLink -Value $true
    return $linkObject
}

# Creates an object that represents the links of in the Quick Launch (underneath the headers)
function CreateNewLinkChildObject
{
    $linkObject = New-Object system.Object
    $linkObject | Add-Member -type NoteProperty -Name WebTitle -Value $prevWebTitle
    $linkObject | Add-Member -type NoteProperty -Name WebURL -Value $prevWebURL
    $linkObject | Add-Member -type NoteProperty -Name QLHeaderTitle -Value $prevHeaderTitle
    $linkObject | Add-Member -type NoteProperty -Name QLHeaderLink -Value $prevHeaderLink
    $linkObject | Add-Member -type NoteProperty -Name QLChildLinkTitle -Value $childNode.Title
    $linkObject | Add-Member -type NoteProperty -Name QLChildLink -Value $childNode.URL
    return $linkObject
}

## Creates an object that represents items in a Links list.
function CreateNewLinkItemObject
{
    $linkObject = New-Object system.Object
    $linkObject | Add-Member -type NoteProperty -Name WebTitle -Value $prevWebTitle
    $linkObject | Add-Member -type NoteProperty -Name WebURL -Value $prevWebURL
    $linkObject | Add-Member -type NoteProperty -Name ListName -Value $list.Title
    $spFieldURLValue = New-Object microsoft.sharepoint.spfieldurlvalue($item["URL"])
    $linkObject | Add-Member -type NoteProperty -Name ItemTitle -Value $spFieldURLValue.Description
    $linkObject | Add-Member -type NoteProperty -Name ItemURL -Value $spFieldURLValue.Url
    return $linkObject
}

# Determines whether or not the passed in Feature is activated on the site or not.
function FeatureIsActivated
{
    param($FeatureID, $Web)
    return $web.Features[$FeatureID] -ne $null
}

# Creates an object that represents a link within the body of a content page.
function CreateNewPageContentLinkObject
{
    $linkObject = New-Object system.Object
    $linkObject | Add-Member -type NoteProperty -Name WebTitle -Value $prevWebTitle
    $linkObject | Add-Member -type NoteProperty -Name WebURL -Value $prevWebURL
    $linkObject | Add-Member -type NoteProperty -Name PageContentLink -Value $link
    return $linkObject
}

$wc = New-Object System.Net.WebClient
$wc.UseDefaultCredentials = $true
$pattern = "(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)"

$site = new-object microsoft.sharepoint.spsite($siteURL)
$webApp = $site.webapplication
$allSites = $webApp.sites
$customLinkObjects =@()

foreach ($site in $allSites)
{
    $allWebs = $site.AllWebs
    foreach ($web in $allWebs)
    {
        ## If the web has the publishing feature turned OFF, use this method
        if((FeatureIsActivated $PublishingFeatureGUID $web) -ne $true)
        {
            $quickLaunch = $web.Navigation.QuickLaunch
            $customLinkObject = CreateNewWebObject
            $customLinkObjects += $customLinkObject
            $prevWebTitle = $customLinkObject.WebTitle
            $prevWebURL = $customLinkObject.WebURL

            # First level of the Quick Launch (Headers)
            foreach ($node in $quickLaunch)
            {
                $customLinkObject = CreateNewLinkHeaderObject
                $customLinkObjects += $customLinkObject
                $prevHeaderTitle = $node.Title
                $prevHeaderLink = $node.Url

                # Second level of the Quick Launch (Links)
                foreach ($childNode in $node.Children)
                {
                    $customLinkObject = CreateNewLinkChildObject
                    $customLinkObjects += $customLinkObject
                }
            }

            # Get all the links in the Top Link bar
            $topLinks = $web.Navigation.TopNavigationBar
            foreach ($node in $topLinks)
            {
                $customLinkObject = CreateNewTopLinkObject
                $customLinkObjects += $customLinkObject
                $prevHeaderTitle = $node.Title
                $prevHeaderLink = $node.Url
            }
        }
        ## If the web has the publishing feature turned ON, use this method
        else
        {
            $publishingWeb = [Microsoft.SharePoint.Publishing.PublishingWeb]::GetPublishingWeb($web)
            $quickLaunch = $publishingWeb.CurrentNavigationNodes
            $customLinkObject = CreateNewWebObject
            $customLinkObjects += $customLinkObject
            $prevWebTitle = $customLinkObject.WebTitle
            $prevWebURL = $customLinkObject.WebURL

            # First level of the Quick Launch (Headers)
            foreach ($node in $quickLaunch)
            {
                $customLinkObject = CreateNewLinkHeaderObject
                $customLinkObjects += $customLinkObject
                $prevHeaderTitle = $node.Title
                $prevHeaderLink = $node.Url

                # Second level of the Quick Launch (Links)
                foreach ($childNode in $node.Children)
                {
                    $customLinkObject = CreateNewLinkChildObject
                    $customLinkObjects += $customLinkObject
                }
            }

            # Get all the links in the Top Link bar
            $topLinks = $web.Navigation.TopNavigationBar
            foreach ($node in $topLinks)
            {
                $customLinkObject = CreateNewTopLinkObject
                $customLinkObjects += $customLinkObject
                $prevHeaderTitle = $node.Title
                $prevHeaderLink = $node.Url
            }
        }

        #Looking for lists of type Links
        $lists = $web.Lists
        foreach ($list in $lists)
        {
            if($list.BaseTemplate -eq "Links")
            {
                $prevWebTitle = $customLinkObject.WebTitle
                $prevWebURL = $customLinkObject.WebURL

                # Going through all the links in a Links List
                foreach ($item in $list.Items)
                {
                    $customLinkObject = CreateNewLinkItemObject
                    $customLinkObjects += $customLinkObject
                }
                Write-Host $list.Title
            }
        }

        #Looking at the default page for each web for links embedded within the content areas
        $htmlContent = $wc.DownloadString($web.URL)
        $result = $htmlContent | Select-String -Pattern $pattern -AllMatches
        $links = $result.Matches | ForEach-Object {$_.Groups[1].Value}
        foreach ($link in $links)
        {
            $customLinkObject = CreateNewPageContentLinkObject
            $customLinkObjects += $customLinkObject
        }

        Write-Host $web.Title
        $web.Dispose()
    }
    $site.dispose()
}

# Exporting the data to a CSV file
$customLinkObjects | Select-Object WebTitle,WebURL,TopNavLink,TopLinkTitle,TopLinkURL,QLHeaderTitle,QLHeaderLink,QLChildLinkTitle,QLChildLink,ListName,ItemTitle,ItemURL,PageContentLink | Export-Csv $filePath
write-host "Done"
Is it possible to filter these results for a specific URL? We have a massive site farm and we need to find out if there are any instances of a specific URL. As it is right now (thanks to your awesome script, by the way) I can take the results and filter through them in Notepad++. Is there a way to only grab the result when an instance of a URL, e.g. ex.example.com, pops up? Thanks again!
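One possible approach, offered as a rough sketch rather than a tested addition to the script: leave the script alone and filter the finished CSV afterwards. The snippet below keeps only the rows where any link column contains the text you are after. The sample rows are made up so the snippet runs on its own, and $targetUrl, $urlColumns, and $hits are illustrative names; the column names come from the Select-Object list at the end of the script.

```powershell
# Text to search for (taken from the question above).
$targetUrl = 'ex.example.com'

# Columns in the report that can hold a link.
$urlColumns = 'TopLinkURL','QLHeaderLink','QLChildLink','ItemURL','PageContentLink'

# Hypothetical sample rows. Against a real report, replace this block with:
#   $rows = Import-Csv "C:\PowerShellScripts\Links.csv"
$rows = @(
    'WebTitle,WebURL,TopLinkURL,QLHeaderLink,QLChildLink,ItemURL,PageContentLink'
    'Team A,http://portal/sites/a,,http://ex.example.com/docs,,,'
    'Team B,http://portal/sites/b,,,,http://other.example.org/,'
    'Team C,http://portal/sites/c,,,,,http://ex.example.com/wiki'
) | ConvertFrom-Csv

# Keep a row if at least one of its link columns contains the target text.
$hits = @($rows | Where-Object {
    $row = $_
    @($urlColumns | Where-Object { $row.$_ -like "*$targetUrl*" }).Count -gt 0
})

$hits | Format-Table WebTitle, WebURL -AutoSize
```

To save the matches instead of printing them, the last line could pipe $hits to Export-Csv with a new file path.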
First of all, thank you for sharing this script. I’ve already used it to audit our WSS 3.0 site in preparation for a migration to 2010 and it helped me track down lots of absolute links that would have been broken with a new domain name. I was surprised how quickly it ran!
I’ve got a few questions/comments…
1) I tried downloading the script from your link at the top of this page, and also copying the script directly off the page. In both cases, I got error messages when running it. Despite knowing nothing about PowerShell, I was easily able to locate the errors and correct them, so it wasn’t an issue, but I thought I’d mention it.
2) What is the meaning of the PageContentLink column? I can’t work out what it’s telling me. For example, the root level site has a whole load of PageContentLinks showing “http://our-wss-site/_layouts/listedit.aspx?List=”.
3) Am I right in thinking that only CEWPs in the default page of the site will be checked? CEWPs on non-default pages, forms etc. won’t be checked? How about pages created/modified in SharePoint Designer with Data View Web Parts? Is there a way to check for links in these as well?
Once again, thank you for the script!
Hi Mark,
I appreciate the comments. To answer your questions:
1. What was the error message and what did you do to correct it?
2. It’s been a while, but if I remember correctly, the PageContentLink column will show any links found in the default home page of a SharePoint web site (SPWeb) that weren’t already found in the Quick Launch, Top Nav Bar or Links List.
3. Correct, this script, as is, will only check the contents of the home/default page of the SPWeb. It should be able to identify links within pages that were created/modified in SharePoint Designer as well.
Great post, thank you, as it helped us in our SharePoint 2010 farm. We just built a SharePoint 2010 environment and found another free tool that also works well; it’s at http://www.qipoint.com. Great post and hope this helps someone else!
I have read so many articles concerning the blogger lovers however this
piece of writing is genuinely a pleasant piece of writing, keep it up.