Script to recrawl all crawl errors

I’m currently working on a document management project which holds 10+ million documents. During a full crawl we had a temporary network issue, which resulted in 340,000 crawl errors. I didn’t want to run another full crawl, since the full crawl had finished for all documents. Instead, I wanted those items to be picked up in the next incremental crawl. In Central Administration you can select the option “Recrawl the item in the next crawl” for each item which caused an error, but I obviously didn’t want to select this option manually for all errors.

To automate this, I’ve created a PowerShell script which can list the errors and, optionally, mark all of them for recrawl. The script is explained in its comments.

#------------------------------------------------------------
# Provide parameters
#------------------------------------------------------------
param (
    # Name of the search service application is mandatory
    [string] $SearchServiceApplicationName = $(throw "Please specify a search service application"),
    # By default, use all available content sources
    [string] $ContentSourceName = "",
    # By default only a list of the errors is shown
    [switch] $RecrawlErrors = $false
)

#------------------------------------------------------------
# Ensure the SharePoint PowerShell Snapin is loaded
#------------------------------------------------------------
if ((Get-PSSnapin "Microsoft.SharePoint.PowerShell" -ErrorAction SilentlyContinue) -eq $null) {
    Add-PSSnapin "Microsoft.SharePoint.PowerShell"
}

#------------------------------------------------------------
# Set some constant values
#------------------------------------------------------------
# The id of the error stating a document will be processed in the next crawl
[int] $errorIdRetryNextCrawl = 437
# The number of documents which should be retrieved per batch from the SSA
[int] $batchSize = 1000
# 2 stands for Errors
[int] $errorLevel = 2

#------------------------------------------------------------
# Retrieve the search service application and crawl log
#------------------------------------------------------------
$ssa = Get-SPEnterpriseSearchServiceApplication -Identity $SearchServiceApplicationName
$crawlLog = New-Object Microsoft.Office.Server.Search.Administration.CrawlLog $ssa

#------------------------------------------------------------
# Retrieve the content source for which the errors should be loaded
#------------------------------------------------------------
# Default use all content sources
[int] $contentSourceId = -1

# If a content source is provided, determine the ID
if ([string]::IsNullOrEmpty($ContentSourceName) -eq $false) {
    Write-Host "Retrieving content source with the name $ContentSourceName... " -NoNewline
    $contentSource = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $ContentSourceName -ErrorAction SilentlyContinue

    if ($contentSource -eq $null) {
        Write-Host "Invalid content source provided" -ForegroundColor Red
        return
    }
    else {
        $contentSourceId = $contentSource.Id
        Write-Host "Content source found" -ForegroundColor Green
    }
}
else {
    Write-Host "No content source provided, all available content sources will be used" -ForegroundColor Yellow
}

#------------------------------------------------------------
# Process the crawl errors per error type
#------------------------------------------------------------
Write-Host ""
Write-Host "Checking errors from the crawl log"
$crawlLog.GetCrawlErrors($contentSourceId, 1) | ForEach-Object {
    Write-Host ([string]::Format("- {0}: {1}", $_.ErrorCount, $_.ErrorMessage))

    # Enable recrawl of errors for all errors except the recrawl on next crawl error
    if ($RecrawlErrors -and $_.ErrorID -ne $errorIdRetryNextCrawl) {
        Write-Host "`t- Marking the errors for recrawl on next crawl"

        # Get the first batch
        $processedItems = 0
        $errors = $crawlLog.GetCrawledUrls($false, $batchSize, "", $true, $contentSourceId, $errorLevel, $_.ErrorID, [datetime]::MinValue, [datetime]::MaxValue)

        do {
            Write-Host ([string]::Format("`t`t- Processing batch {0}/{1}... ", $processedItems, $processedItems + $batchSize)) -NoNewline

            # Recrawl the errors
            $errors | ForEach-Object {
                $crawlLog.RecrawlDocument($_.FullUrl) | Out-Null
            }
            Write-Host "done" -ForegroundColor Green
            $processedItems += $batchSize

            # Get the next batch
            $errors = $crawlLog.GetCrawledUrls($false, $batchSize, "", $true, $contentSourceId, $errorLevel, $_.ErrorID, [datetime]::MinValue, [datetime]::MaxValue)
        }
        while ($errors -ne $null)
        Write-Host ""
    }
}

Open PDF documents in your client application

July 2, 2015

When working with PDF files in SharePoint, most of the time these have to be opened in the browser and most of the time that works. That is because an Adobe plugin within your browser checks if the file which is returned has a content type of application/pdf and if so, it opens the document within your browser.

For one of my customers this was not what they wanted. They wanted to open multiple PDF documents and show them next to each other on their screen (not in different browser windows). You can disable the Adobe browser plugin, but this will impact all PDFs you download, including those from other sources (the Internet etc.).

Okay, so how did we fix this? The most important part is that the Adobe plugin checks the content type (the MIME type) returned. So to start, we need to change this one… Perform these steps on ALL of your front-end servers:

  1. Open Internet Information Services.
  2. On the GLOBAL level, navigate to the MIME types.
    The PublishingHttpModule handles the authorize request and also looks up the MIME type, based on the global settings. This means the change applies to all web applications within your farm, whether you want it or not.
  3. Find the MIME type for the extension pdf.
  4. Change the MIME type to application/pdf2.
  5. Perform an IISRESET.
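
The steps above can also be scripted, which is convenient when several front-end servers are involved. The following is a minimal sketch, not a tested deployment script: it assumes the WebAdministration module that ships with IIS is available and that the .pdf mapping exists at the server (global) level. The variable names are chosen for illustration only.

```powershell
# Sketch: change the global MIME type for .pdf (variable names are illustrative).
$extension   = '.pdf'
$newMimeType = 'application/pdf2'

# XPath-style filter that selects the server-level mimeMap entry for the extension
$filter = "system.webServer/staticContent/mimeMap[@fileExtension='$extension']"

if (Get-Module -ListAvailable -Name WebAdministration) {
    Import-Module WebAdministration

    # MACHINE/WEBROOT/APPHOST targets the global (applicationHost.config) level
    Set-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' -Filter $filter -Name 'mimeType' -Value $newMimeType

    # Restart IIS so the new MIME type is picked up
    & iisreset
}
else {
    Write-Warning 'WebAdministration module not found; run this on an IIS front-end server.'
}
```

Run it in an elevated PowerShell session on every front-end server. To roll back, set `$newMimeType` to `application/pdf` and run it again.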

These actions prevent the PDF document from opening in the browser; instead you will get the save dialog (which I don’t want).
Note: This dialog will only be shown for documents which haven’t been opened before. For other documents, you will have to clear your local browser cache!

If you want the document to open automatically within the Adobe Reader or Writer, or whichever application you use to open PDF files, this can be achieved by updating your registry. This can of course also be set by a policy for your entire company. Use the following steps to update your registry so that your PDF application opens automatically:

  • Start regedit on your client (or create a policy)
  • Navigate to HKEY_CURRENT_USER\Software\Microsoft\Windows\Shell
  • Navigate to AttachmentExecute or create this key if it doesn’t exist yet.
  • Navigate to {0002DF01-0000-0000-C000-000000000046} or create this key if it doesn’t exist yet.
  • Create a new Binary Value with the name AcroExch.Document.11

Note: The name of the binary value can differ per company. AcroExch.Document.11 is used for Adobe Acrobat Reader. To check your value, open a command prompt and execute the command assoc .[extension], so in this case assoc .pdf
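
The registry steps above can be sketched in PowerShell as follows. This is a hedged example rather than a verified policy: it assumes that an empty (zero-length) binary value is sufficient, which is what regedit creates when you add a new Binary Value without data, and it uses the AcroExch.Document.11 name from the steps above. Replace `$progId` with the value that `assoc .pdf` reports on your clients.

```powershell
# Sketch: create the AttachmentExecute entry for the current user.
# $progId is an assumption taken from the steps above; check it with "assoc .pdf".
$progId  = 'AcroExch.Document.11'
$keyPath = 'HKCU:\Software\Microsoft\Windows\Shell\AttachmentExecute\{0002DF01-0000-0000-C000-000000000046}'

if ([System.Environment]::OSVersion.Platform -eq [System.PlatformID]::Win32NT) {
    # Create the key (and any missing parent keys) if it doesn't exist yet
    if (-not (Test-Path -LiteralPath $keyPath)) {
        New-Item -Path $keyPath -Force | Out-Null
    }

    # Add an empty binary value named after the file type
    New-ItemProperty -LiteralPath $keyPath -Name $progId -PropertyType Binary -Value ([byte[]]@()) -Force | Out-Null
}
else {
    Write-Warning 'The registry is only available on Windows; nothing was changed.'
}
```

Because the key lives under HKEY_CURRENT_USER, the script has to run in the context of each user, for example as a logon script.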

That’s it! All PDF documents stored within your SharePoint environment will now be opened within your client application.

Categories: Environment, SharePoint

Cross-domain errors with SharePoint Apps

When building SharePoint Apps, JavaScript can be used to communicate with your SharePoint environment. Lately I’ve had a couple of questions about how this works with CORS (Cross-Origin Resource Sharing).

The problem people faced was that SharePoint was hosted on a URL like https://mytenant.sharepoint.com while the app itself was hosted on a URL like https://myapp.whatever.com. While developing Apps for SharePoint it’s a common and best practice to use totally different domains for security purposes (app isolation).

Within SharePoint there is something called the Cross-Domain library. This is not a document library within SharePoint, but a JavaScript file (SP.RequestExecutor.js) which contains functions that allow you to perform CRUD operations within SharePoint from a different domain. It basically works as a proxy.

There are plenty of examples on how this library can be used, for example the article Access SharePoint 2013 data from apps using the cross-domain library on MSDN, but they still had issues getting it to work cross-domain.

The problem is that a lot of companies have their SharePoint URL in the Trusted Sites or Local Intranet zone within their browser settings, but not the URL where the app is hosted. The cross-domain calls can only work if BOTH URLs are added to the same zone! Or… not added at all. It will not work when they are placed in different security zones…

Categories: SharePoint

Forcefully delete site collection

August 21, 2014

Today I found a site collection on a customer environment which gave a completely blank page when you opened it via a browser. It didn’t give a 404 (Not Found) error, it was just a blank page. I decided to figure out what was happening and found that during the creation of the site collection, an IISRESET had taken place. Because of this, the site wasn’t completely provisioned. Well, if it wasn’t completely provisioned, I don’t need it… Nobody could have added content.

I found out that I couldn’t remove the site using Central Administration. When you navigate to the site collection using the “Delete a site collection” page, the details (right-hand side of the page) were not loaded and you cannot select the site collection. So… I wanted to delete the site using PowerShell, but this gave me an error:

PS C:\Users\macaw> remove-spsite http://dms/case/P68430
Confirm
Are you sure you want to perform this action?
Performing the operation “Remove-SPSite” on target “http://dms/case/P68430“.
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help (default is “Y”): Y
remove-spsite : <nativehr>0x80070003</nativehr><nativestack></nativestack>
At line:1 char:1
+ remove-spsite http://dms/case/P68430
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (Microsoft.Share…mdletRemoveSite:SPCmdletRemoveSite) [Remove-SPSite], DirectoryNotFoundException
+ FullyQualifiedErrorId : Microsoft.SharePoint.PowerShell.SPCmdletRemoveSite

Apparently, the normal remove-spsite cmdlet cannot delete a site collection which is not fully provisioned, and this cmdlet doesn’t have a force flag. To forcefully delete the site collection, I used the SPContentDatabase.ForceDeleteSite method:

$siteUrl = "http://dms/case/P68430"
$site = Get-SPSite $siteUrl
$siteId = $site.Id
$siteDatabase = $site.ContentDatabase
# Arguments: site ID, delete AD accounts ($false), gradual delete ($false)
$siteDatabase.ForceDeleteSite($siteId, $false, $false)

Create lookup field using PowerShell and CSOM

May 19, 2014

For our projects we always try to avoid manual configurations. This is because it is a tedious and error-prone process if you work with a DTAP environment. To avoid this, we also try to script as much as possible for SharePoint Online projects. Lately we worked on creating lookup fields in SharePoint Online, using PowerShell and CSOM. Creating fields this way is pretty easy, but connecting lookup fields required casting the Microsoft.SharePoint.Client.Field object to a Microsoft.SharePoint.Client.FieldLookup object.

Within CSOM this can be done with the ClientRuntimeContext.CastTo method, but this is a generic method (CastTo<T>), which PowerShell does not support easily. To call it, you can use reflection and build a typed version of the method with MakeGenericMethod.
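
To try the reflection technique without the CSOM assemblies, here is a small stand-alone sketch. It uses Activator.CreateInstance<T>() from the .NET base class library purely as a stand-in for CastTo<T>; the pattern (get the generic method definition, close it over a type, invoke it) is the same.

```powershell
# Get the open generic method definition: Activator.CreateInstance<T>()
$genericMethod = [System.Activator].GetMethod('CreateInstance', [Type]::EmptyTypes)

# Close the generic method over a concrete type (here: StringBuilder)
$closedMethod = $genericMethod.MakeGenericMethod([System.Text.StringBuilder])

# Invoke it; the method is static, so the target object is $null
$instance = $closedMethod.Invoke($null, @())

# Shows the concrete type that was created
$instance.GetType().FullName
```

With CastTo the only differences are that the method is an instance method, so the client context is passed as the target object, and the field to cast is passed as the argument.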

The full PowerShell script is provided below:

#-------------------------------------------------------------
# LOAD CLIENT ASSEMBLIES
#-------------------------------------------------------------
$clientAssembliesFolder = "D:\ClientAssemblies"
Add-Type -Path (Join-Path -Path $clientAssembliesFolder -ChildPath "Microsoft.SharePoint.Client.dll")
Add-Type -Path (Join-Path -Path $clientAssembliesFolder -ChildPath "Microsoft.SharePoint.Client.Runtime.dll")

#-------------------------------------------------------------
# INITIALIZE CONTEXT
#-------------------------------------------------------------
[string]$siteUrl = "https://[UseYourOwn].sharepoint.com/sites/Dev"
[string]$username = "admin@[UseYourOwn].onmicrosoft.com"
[string]$password = "[UseYourOwn]"
# Avoid the name $pwd here: it is a built-in automatic variable (the current location)
$securePassword = $password | ConvertTo-SecureString -AsPlainText -Force
$context = New-Object Microsoft.SharePoint.Client.ClientContext($siteUrl)
$credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($username, $securePassword)
$context.Credentials = $credentials

#-------------------------------------------------------------
# LOAD CASTTO FOR LOOKUPS
#-------------------------------------------------------------
$castToMethodGeneric = [Microsoft.SharePoint.Client.ClientContext].GetMethod("CastTo")
$castToMethodLookup = $castToMethodGeneric.MakeGenericMethod([Microsoft.SharePoint.Client.FieldLookup])

#-------------------------------------------------------------
# LOAD LISTS
#-------------------------------------------------------------
[string] $originalListTitle = "List1"
[string] $destinationListTitle = "List2"
$listOriginal = $context.Web.Lists.GetByTitle($originalListTitle)
$context.Load($listOriginal)
$listDestination = $context.Web.Lists.GetByTitle($destinationListTitle)
$context.Load($listDestination)
$context.ExecuteQuery() # This loads the necessary list ID

#-------------------------------------------------------------
# CREATE LOOKUP
#-------------------------------------------------------------
[string] $internalName = "LookupWithStaticName"
[string] $displayName = "LookupTest"
[string] $displayFieldForLookup = "Title"
# The field is created with $internalName as its display name and renamed via Title below
[string] $lookupFieldXML = "<Field DisplayName=`"$internalName`" Type=`"Lookup`" />"
$option = [Microsoft.SharePoint.Client.AddFieldOptions]::AddFieldToDefaultView

$newLookupField = $listDestination.Fields.AddFieldAsXml($lookupFieldXML, $true, $option)
$context.Load($newLookupField)
# Cast the generic Field to a FieldLookup using the reflected CastTo method
$lookupField = $castToMethodLookup.Invoke($context, $newLookupField)
$lookupField.Title = $displayName
$lookupField.LookupList = $listOriginal.Id
$lookupField.LookupField = $displayFieldForLookup
$lookupField.Update()
$context.ExecuteQuery()

SharePoint 2013 warm-up script

For SharePoint on-premises platforms it’s a good practice to use a warm-up script to avoid long loading times in the morning. By default IIS recycles the web application pools every night to clean up the memory, and this is a good practice. Todd Klindt has written a nice post about the Invoke-WebRequest cmdlet, which is available in PowerShell v3, and how to use it as the basis for your warm-up script.

I used it as a basis and created the script you find below. Important notes:

  • The script will load the start page of the root site collection of every web application.
  • Different types of web templates use different assemblies. If you want to preload all assemblies, ensure you load the different types of sites. The additionalUrls array is used for that in the script.
  • When you use multiple front-end servers, you want to schedule the script on all front-end servers. Also make sure requests made on the server itself don’t go through the load balancer; you can do this by updating the hosts file.

#------------------------------------------------------
# Ensure the SharePoint Snapin has been loaded
#------------------------------------------------------
if ((Get-PSSnapin -Name "Microsoft.SharePoint.PowerShell" -ErrorAction SilentlyContinue) -eq $null) {
    Add-PSSnapin "Microsoft.SharePoint.PowerShell"
}

#------------------------------------------------------
# Simple method to write status code with a colour
#------------------------------------------------------
function Write-Status([Microsoft.PowerShell.Commands.WebResponseObject] $response) {
    $foregroundColor = "DarkRed"
    if ($response.StatusCode -eq 200) {
        $foregroundColor = "DarkGreen"
    }
    Write-Host ([string]::Format("{0} (Status code: {1})", $response.StatusDescription, $response.StatusCode)) -ForegroundColor $foregroundColor
}

#------------------------------------------------------
# Warm-up all web applications
#------------------------------------------------------
Get-SPWebApplication | ForEach-Object {
    Write-Host ([string]::Format("WebApplication request fired for {0} [{1}]... ", $_.DisplayName, $_.Url)) -NoNewline
    Write-Status -response (Invoke-WebRequest $_.Url -UseDefaultCredentials -UseBasicParsing)
}

#------------------------------------------------------
# Since the root of web applications use different templates than other site collections, also load other sites of different
# types. This ensures their assemblies also get loaded in memory
#------------------------------------------------------
$additionalUrls = @("http://developmentserver/sites/search",
                    "http://developmentserver/site/teamsite")
$additionalUrls | ForEach-Object {
    Write-Host ([string]::Format("Additional web request fired for Url: {0}... ", $_)) -NoNewline
    Write-Status -response (Invoke-WebRequest $_ -UseDefaultCredentials -UseBasicParsing)
}

 

 

Re-activating web features within web application

One of my projects is a huge SharePoint 2013 on-premises platform with 200,000+ (sub) sites. I’ve created a custom web template to ensure all sites are created the same way, with the same settings. A web template works very well for these environments, but when you update the template, the changes will not be made in all existing sites. The web template will only be applied when creating new sites.

When we have new updates, I will not throw away all sites and re-create them, but I will re-activate certain features to ensure the updates are applied.

The script I’m using is as follows:

#------------------------------------------------------------
# Add SharePoint PowerShell Snapin
#------------------------------------------------------------
if ((Get-PSSnapin -Name Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue) -eq $null) {
    Add-PSSnapin Microsoft.SharePoint.PowerShell
}

#------------------------------------------------------------
# Set variables
#------------------------------------------------------------
$webApplicationUrl = "http://veemssdev02"
$featureIds = @("e4acfa03-b1e6-4eed-aeab-1bd17551aa59", "Macaw.SP2013.Intranet.InSite_AddDefaultPage_Web")

#------------------------------------------------------------
# Reactivate features
#------------------------------------------------------------
Get-SPWebApplication -Identity $webApplicationUrl | Get-SPSite -Limit All | Get-SPWeb -Limit All | ForEach-Object {
    Write-Host ([string]::Format("Testing web {0} [{1}]", $_.Title, $_.Url))

    foreach ($featureId in $featureIds) {
        # A feature can be listed by its ID or by its display name
        $feature = $_.Features | where {$_.DefinitionId -eq $featureId -or $_.Definition.DisplayName -eq $featureId}
        if ($feature -ne $null) {
            Write-Host ([string]::Format("`t- Feature {0} ({1}) found. Re-enabling the feature.", $feature.Definition.DisplayName, $feature.DefinitionId))
            Write-Host "`t`t- Disabling feature"
            Disable-SPFeature -Identity $featureId -Url $_.Url -Confirm:$false
            Write-Host "`t`t- Enabling feature"
            Enable-SPFeature -Identity $featureId -Url $_.Url -Confirm:$false -Force
        }
    }
}

When you do not want to re-activate features, but want to enable new features, you can simply use the same code, but remove the feature check (if($feature -ne $null)) and the Disable-SPFeature call.

Categories: PowerShell, SharePoint

Small spike in load times every minute

The problem

This week I was asked to help out one of our customers. They were looking into a performance issue of their SharePoint 2013 on-premise farm. Occasionally they experienced a spike in load times (which can be caused by a search index, backup operation etc.).

To get an indication how often the spike in load times occurred, they used a PowerShell script which loads the home page of their Intranet every 10 seconds and logs the load time. The results showed that the load time of the page was less than 100 milliseconds 5 times in a row and then it increased to 700 milliseconds. Even though this wasn’t the issue they were looking for (the spikes they were investigating were 7-10 seconds), this was something they couldn’t figure out. The 700 milliseconds load time itself isn’t really an issue, but not knowing what causes these spikes is.
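
The customer’s measuring script isn’t shown here, but such a script could look like the sketch below. The function name Measure-PageLoad, its parameters and the example URL are hypothetical; the -Request parameter only exists so the timing logic can be tried without a SharePoint farm.

```powershell
# Sketch: request a page a number of times and output the load time per request.
# Function name, parameters and defaults are hypothetical.
function Measure-PageLoad {
    param(
        [Parameter(Mandatory = $true)] [string] $Url,
        [int] $Samples = 6,
        [int] $IntervalSeconds = 10,
        # The action that is timed; by default a web request with the current credentials
        [scriptblock] $Request = { param($u) Invoke-WebRequest -Uri $u -UseDefaultCredentials -UseBasicParsing | Out-Null }
    )

    for ($i = 1; $i -le $Samples; $i++) {
        # Measure-Command returns a TimeSpan for the executed script block
        $elapsed = Measure-Command { & $Request $Url }

        [pscustomobject]@{
            Timestamp    = Get-Date
            Url          = $Url
            Milliseconds = [math]::Round($elapsed.TotalMilliseconds)
        }

        # Wait before the next sample, except after the last one
        if ($i -lt $Samples) { Start-Sleep -Seconds $IntervalSeconds }
    }
}
```

For example, `Measure-PageLoad -Url "http://intranet" -Samples 60 | Export-Csv loadtimes.csv -NoTypeInformation` would log ten minutes of samples; with a one-minute page cache you would expect roughly one slow request in every six.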

The answer

Using the ULS log, we traced the 700 millisecond requests, found the actions which were taken at that time and compared those to the actions taken when the page loaded in less than 100 milliseconds. The additional actions consisted of 7 cross-site queries. These queries were performed for loading the Content By Query WebParts on the home page. Just to check how much time those queries cost at the database layer, we traced the queries using the SQL Server Profiler. All these queries were pretty quick and they only accessed information of a specific list.

The reason the Content By Query WebParts only load once a minute is that the home page is part of a publishing site. By default publishing sites cache the pages for a minute. So most hits within that minute can be served from cache, but once a minute the latest data is requested by the queries and parsed. This only shows that the cache mechanism of SharePoint really helps to increase your performance.

 

Categories: SharePoint

Creating new document via SharePoint 2013 shows Page not Found

Today I was working with SharePoint 2013 and I wanted to create a new Word document in a document library. When I clicked on the New Document button, I got a “The webpage cannot be displayed” error. The URL of the page was: ms-word:nft|u|http://[customerurl]/Documents/Forms/template.dotx|s|http://[customerurl]/Documents

The problem I had was that my client PC had Office 2010 AND SharePoint Designer 2013. When you have applications of Office 2010 next to Office 2013 on your machine, you might get this issue, because both Office versions then have Microsoft SharePoint Foundation Support installed. You can either remove this from one of the versions, or just install SP2 for Office 2010.

Categories: Environment, SharePoint

Display templates not reflecting ALL changes

April 10, 2014

Lately I’ve been working a lot with custom display templates and custom refiners. This all worked pretty well; I could update the look and feel by changing the body without any problem. However, changes I made in the head section of the display template were not reflected… And that is where you update your Managed Properties, and in my case I had also added some logic to retrieve some variables.

Using the debugger of Internet Explorer I saw that the old versions of the JavaScript files were still used for the display templates. Somehow these were cached… The URL of my display template ended with ?ctag=122$$15.0.4481.1005. This number can of course change, but somewhere the old version was still in the SharePoint cache. This led me to look at my Result Types, since my display templates are used based on those…

On the top of the page I found the following message:

PropertySyncNeeded

Just click the Update link and you will get the following message:

PropertySyncDone

Refresh your search result page and the latest version of your display template will be used!

Ben Prins

What I want to remember about SharePoint
