Search in Microsoft 365 SharePoint works well out of the box: you can run a full-text search and refine the results by “File type” and “Last modified”. But what if you want your content to be tagged with your own custom metadata (e.g. “Article category”) and you want to refine search results by that metadata? It is possible, and I’ll provide the solution below. The solution includes working with the site term store (creating term groups, term sets and terms), configuring list/library columns and updating the site search schema (mapping crawled properties to managed properties).
Tag Archives: Microsoft Search
Restricted SharePoint Search Deep Dive
Restricted SharePoint Search is a new Microsoft feature that mitigates the site oversharing issue when you are implementing Copilot. The feature is documented here, but I still have some questions, e.g.:
- What about external data? Copilot can use external data via agents and connectors. But would Restricted SharePoint Search, if implemented, allow data from external connectors to be used in Copilot?
- “Users’ OneDrive files, chats, emails, calendars they have access to”: does that mean each user’s own data only, or all OneDrive data shared with the user?
- What exactly is “Files from their frequently visited SharePoint sites”? I mean, how frequently does a user need to visit a site for this?
- What exactly does “Files that the users viewed, edited, or created” mean?
- What about Teams chat messages, e-mails, Viva Engage messages?
- “Files that were shared directly with the users”: does that mean individual files only, or can it include folders, libraries, sites?
- If a user is a member of a team, would all team content be included?
- It says “Files…”, but would site pages be included? Or list items? Or list item attachments? Pages are something people use to create wikis to share knowledge.
- How long does it take for Microsoft 365 to start restricting results after Restricted SharePoint Search is enabled?
- How to deal with “You do not have the required license to perform this operation”?
Here I’m going to answer the questions above.
So far I have built a test scenario in my dev tenant that includes multiple collaborating users and content in the form of files, pages, list items and messages spread across multiple sites falling into different categories of Restricted SharePoint Search allowed content.
You do not have the required license…
If you get “You do not have the required license to perform this operation” when running Get-SPOTenantRestrictedSearchMode or Get-PnPTenantRestrictedSearchMode, it means no Copilot for Microsoft 365 licenses are assigned in the tenant yet. The feature, Restricted SharePoint Search, works only when at least one Copilot license is assigned in the tenant.
… TBC
Restricted SharePoint Search rationale
Restricted SharePoint Search is a new (2024) Microsoft 365 feature that should help Copilot and general search results be more relevant, especially in large Microsoft 365 environments.
The problem background
When you have a really large number of sites, it is very difficult to keep them all in a well-managed state, e.g. to have reasonable (minimal) permissions on each site. So the typical situation (unfortunately) is that we have a lot of overshared sites. There are also a lot of ownerless sites where permissions are not managed. We know that search is security-trimmed, i.e. a user gets search results only from content he/she already has access to. But with overshared sites, users get results they should not be able to see. With the regular search experience, a user can see with his/her own eyes the source of each result, so the user can understand that results are coming from sites he/she should not have access to (overshared sites). But when it comes to AI-based search (Copilot), the user gets answers but does not always know the source of that data.
So the problem is: we want to ensure Copilot is grounded in a proper set of data and results are curated to users’ needs and access permissions. For Copilot we really need to exclude from the search scope those sites where we are not sure the content is valid, accurate and properly secured. We do not want users to get garbage or exposed sensitive information as an authoritative answer from Copilot.
The solution
This is where the Restricted SharePoint Search feature should help. With this feature you can restrict organization-wide search (and Copilot) to two kinds of content. The first is a curated list of SharePoint sites (“allowed sites”): public sites that have passed attestation, where permissions are checked and data governance policies are applied. The second is content the user works with on a daily basis: his/her own documents and content shared with the user directly (check details in Microsoft’s How does Restricted SharePoint Search work), i.e. content the user is normally supposed to have access to.
Excluded from search scope would be sites shared with user indirectly, e.g. something that was shared with everyone.
The root cause
Interestingly, with this feature Microsoft is not solving the real issue but hiding it, just making Microsoft 365 look more secure.
The real problem (root cause) is oversharing of data. But Microsoft has already sold us SharePoint (and then Microsoft 365). Now Microsoft is trying to sell us Copilot, so they “solved” the oversharing issue with a “let us limit search” solution instead of “let’s fix oversharing”.
Note 1: The Restricted SharePoint Search feature is free, i.e. it is included in the standard Microsoft 365 license. Do not confuse it with the site access restriction policy, a feature that requires a SharePoint Premium license and lets you restrict access to certain SharePoint sites to specific groups only.
Note 2: I know that Microsoft is trying to address the oversharing issue as part of their SharePoint Premium (SharePoint Advanced Management) package, e.g. with AI Insights and Data access governance insights: reports that can help prevent oversharing by detecting sites that contain potentially overshared or sensitive content. With Manage content lifecycle we’d decrease the amount of “garbage” or outdated content.
But SharePoint Advanced Management is licensed separately, whereas Restricted SharePoint Search is free.
Note 3: I know that users are an even more real problem because they tend to simplify and share information irresponsibly.
New Microsoft Graph Connector service plan
Microsoft Graph connectors allow your organization to index third-party data into Microsoft Graph. They make Microsoft 365 Copilot more useful, as it has more information relevant to your organization to answer prompts.
According to Microsoft, Microsoft 365 will soon include a new service plan, Graph Connectors Search with Index, offering a 50 million item index limit per tenant at no cost. Rollout starts September 2024.
Previously, to index third-party data into Microsoft Graph through Microsoft Graph connectors, you either needed to have a built-in entitlement through specific licenses (e.g., 500 items of index quota per Microsoft Copilot for Microsoft 365 license) or purchase add-on quota. With this change, the index quota per license entitlement is removed, as is add-on cost for additional quota. You now receive an entitlement of 50 million items for each tenant.
Each entity (or record) from the source data system that you add to Microsoft Graph can be considered an item, which then shows up as a unique citation in Copilot’s responses, as a unique search result in Microsoft Search, etc. Depending on the type of data source, 1 item is:
- 1 document (Word, Excel, PowerPoint, PDF, etc.) in a file share
- 1 wiki page in Confluence
- 1 webpage in a website
- 1 ticket/issue in Jira
Total quota utilized is calculated in terms of total items stored in the index. Updates/changes to an item are not counted in any manner. There are no cost implications of updating an item multiple times. It still counts as 1 item only.
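To make the counting rule concrete, here is a toy Python model of it. The `ConnectorIndex` class and its methods are made up for this illustration and are not part of any Microsoft SDK; only the 50 million figure comes from the announcement above.

```python
ITEM_LIMIT_PER_TENANT = 50_000_000  # per-tenant limit quoted in the announcement


class ConnectorIndex:
    """Toy model of a tenant index: quota used = number of distinct items stored."""

    def __init__(self, limit=ITEM_LIMIT_PER_TENANT):
        self.limit = limit
        self.items = {}  # item id -> latest content

    def upsert(self, item_id, content):
        """Add a new item or update an existing one; only new items consume quota."""
        if item_id not in self.items and len(self.items) >= self.limit:
            raise RuntimeError("index quota exceeded")
        self.items[item_id] = content

    def quota_used(self):
        return len(self.items)


index = ConnectorIndex()
index.upsert("jira:PROJ-1", "ticket, first version")
index.upsert("jira:PROJ-1", "ticket, edited")  # an update: still one item
index.upsert("confluence:page-42", "wiki page")
print(index.quota_used())  # 2
```

Updating the Jira ticket a second time leaves the count unchanged, which is the behaviour the paragraph above describes.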
Applicable to subscriptions: Office 365 E1, Office 365 E3, Office 365 E5, Microsoft 365 E3, Microsoft 365 E5, Microsoft 365 F1, Microsoft 365 F3, Office 365 F3, Microsoft 365 Business Basic, Microsoft 365 Business Standard, Microsoft 365 Business Premium, Office 365 G1, Office 365 G3, Office 365 G5, Microsoft 365 G3, Microsoft 365 G5, Office 365 A3, Office 365 A5, Microsoft 365 A3, Microsoft 365 A5
Using Microsoft.Graph PowerShell to Search in Microsoft 365
There is a Microsoft.Graph PowerShell module provided by Microsoft which simplifies usage of Microsoft Graph API. Below is how to authenticate to MS Graph and how to search within SharePoint and Teams Microsoft 365 content using Microsoft.Graph PowerShell module.
Authentication
Interactive authentication code sample:
# Prerequisites
Get-Module Microsoft.Graph.Authentication -ListAvailable
Get-Module Microsoft.Graph.Search -ListAvailable
# Interactive Authentication
$clientid = 'd82858e0-ed99-424f-a00f-cef64125e49c'
$TenantId = '7ddc7314-9f01-45d5-b012-71665bb1c544'
Connect-MgGraph -ClientId $clientid -TenantId $TenantId
For daemon app authentication we need a certificate configured in the Azure app and installed on the user’s machine. Daemon app authentication code sample (please specify your tenant id, app (client) id and certificate thumbprint):
# App Authentication
$TenantId = ""
$clientID = ""
$certThumbprint = ""
Connect-MgGraph -ClientId $clientid -TenantId $TenantId -CertificateThumbprint $certThumbprint
Search with Microsoft.Graph
# Search
$params = @{
    requests = @(
        @{
            entityTypes = @(
                "driveItem"
            )
            query = @{
                queryString = "test*"
            }
            from = 0
            size = 50
            fields = @(
                "title"
                "description"
            )
            region = "NAM"
        }
    )
}
$res = Invoke-MgQuerySearch -Body $params
$res.HitsContainers[0].Hits
Note: when you are calling MS Graph Search API authenticated as user – you need to remove “region” parameter.
Code samples: https://github.com/VladilenK/m365-PowerShell/tree/main/KBA/Search
Search Microsoft 365 content programmatically: all articles index
Search in SharePoint using Microsoft Graph API with application credentials
Microsoft Graph API allows you to work with all Microsoft 365 content, including searching through Exchange e-mail messages, Yammer (Viva Engage) and Teams chat messages and, of course, OneDrive and SharePoint content (please refer to the original doc). Let me focus on searching in SharePoint Online and OneDrive here, but you can use the same technique to search through other Microsoft 365 services. I will use PowerShell, but the same ideas should work for other platforms/languages: Python, C#, Node.js etc.
Assuming we have a registered Azure app configured correctly, including Secrets/Certificates blade and API permissions provided – we should be ready to authenticate and call Graph API unattended – on behalf of application itself.
Let us authenticate as a service/daemon app with client id and client secret:
# Authenticate to M365 as an unattended application
# specify your app id, app secret, tenant id:
$clientID = ""
$clientSc = ""
$TenantId = ""
# Construct URI and body needed for authentication
$uri = "https://login.microsoftonline.com/$tenantId/oauth2/v2.0/token"
$body = @{
client_id = $clientID
client_secret = $clientSc
scope = "https://graph.microsoft.com/.default"
grant_type = "client_credentials"
}
# Get OAuth 2.0 Token
$tokenRequest = Invoke-WebRequest -Method Post -Uri $uri -ContentType "application/x-www-form-urlencoded" -Body $body -UseBasicParsing
$token = ($tokenRequest.Content | ConvertFrom-Json).access_token
$headers = @{Authorization = "Bearer $token" }
Below is how I search Microsoft 365 content programmatically from PowerShell using MS Graph API, authenticated as an application:
# Search
# MS Graph Search API url (beta or v1.0):
$apiUrl = "https://graph.microsoft.com/beta/search/query"
# Entity types to search, as a JSON array (JSON requires double quotes).
# Wider alternative: '["driveItem","listItem","list","drive","site"]'
$entityTypes = '["driveItem","listItem"]'
# KQL query. Alternative example: "LastModifiedTimeForRetention<2021-01-01"
$query = "test*"
$body = @"
{
    "requests": [
        {
            "entityTypes": $entityTypes,
            "query": {
                "queryString": "$query"
            },
            "from": 0,
            "size": 5,
            "fields": ["WebUrl", "lastModifiedBy", "name"],
            "region": "NAM"
        }
    ]
}
"@
# Call Graph API
$res = Invoke-RestMethod -Headers $headers -Uri $apiUrl -Body $body -Method Post -ContentType 'application/json'
# Explore the returned object
$res.value[0].searchTerms
$res.value[0].hitsContainers[0].hits
$res.value[0].hitsContainers[0].hits.Count
$res.value[0].hitsContainers[0].moreResultsAvailable
Notice we use “region”: it is required when searching with Graph API under application credentials. Otherwise you will get the error message “SearchRequest Invalid (Region is required when request with application permission.)”.
The “fields” parameter lets you request only the fields you need. As the returned object will be smaller, your request will perform faster.
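As a small illustration of the two parameters above, here is a sketch in Python (one of the other languages mentioned earlier) that builds the request body, adding “region” only when it is passed in (app-only calls) and “fields” only when a field list is given. The helper name `build_search_body` and its default values are my own choices for the example, not part of Graph.

```python
import json


def build_search_body(entity_types, query, fields=None, region=None, start=0, size=25):
    """Build the JSON body for POST .../search/query.

    Pass `region` only for calls made with application permissions;
    leave it out when the call is made on behalf of a user.
    """
    request = {
        "entityTypes": list(entity_types),
        "query": {"queryString": query},
        "from": start,
        "size": size,
    }
    if fields:
        request["fields"] = list(fields)  # ask only for the fields you need
    if region:
        request["region"] = region
    return {"requests": [request]}


# Body for an unattended (application) call
app_body = build_search_body(
    ["driveItem", "listItem"], "test*",
    fields=["webUrl", "lastModifiedBy", "name"], region="NAM",
)
# Body for a call on behalf of a user: no region
user_body = build_search_body(["driveItem", "listItem"], "test*")

print(json.dumps(app_body, indent=2))
```

Using `json.dumps` on a dictionary also avoids quoting mistakes that are easy to make when the body is assembled as a raw string.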
Your request might match a large number of objects in Microsoft 365, and Graph will not return all of them in one response. AFAIK the current limit is 500 per call, so if more than 500 objects are found, only the first 500 are returned. You can specify how many objects to return per call with the “size” parameter.
You can check the value of the $res.value[0].hitsContainers[0].moreResultsAvailable property: if it is True, there are more results. This value, together with the “from” and “size” parameters, lets you organize a loop that calls the search API repeatedly until all results are returned.
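The loop described above could look roughly like this in Python. It is a sketch under assumptions: `search_all` and `fake_post` are hypothetical names, and `post` stands for whatever function sends the body to the search/query endpoint and returns the parsed JSON. A fake one is used here so the paging logic can be run without a tenant.

```python
def search_all(post, entity_types, query, page_size=100, region=None, max_pages=50):
    """Collect hits page by page until moreResultsAvailable is false.

    `post` is any callable that takes the request body (a dict) and returns
    the parsed JSON response of the search/query call.
    """
    hits = []
    start = 0
    for _ in range(max_pages):  # safety cap so a bad response cannot loop forever
        request = {
            "entityTypes": list(entity_types),
            "query": {"queryString": query},
            "from": start,
            "size": page_size,
        }
        if region:
            request["region"] = region
        response = post({"requests": [request]})
        container = response["value"][0]["hitsContainers"][0]
        page = container.get("hits", [])
        hits.extend(page)
        if not page or not container.get("moreResultsAvailable"):
            break
        start += len(page)  # next page starts after the hits already received
    return hits


def fake_post(body):
    """Stand-in for the real HTTP call: serves 7 fake hits, page by page."""
    req = body["requests"][0]
    data = [{"hitId": str(i)} for i in range(7)]
    page = data[req["from"]:req["from"] + req["size"]]
    more = req["from"] + req["size"] < len(data)
    return {"value": [{"hitsContainers": [{"hits": page, "moreResultsAvailable": more}]}]}


all_hits = search_all(fake_post, ["driveItem"], "test*", page_size=3)
print(len(all_hits))  # 7
```

In real use, `post` would be a small wrapper around your HTTP client that adds the Authorization header and posts the body to the search/query URL.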
Other articles index:
Search m365 SharePoint and Teams content programmatically via MS Graph API
Using Microsoft Graph Search API as current user
Microsoft Graph API allows you to work with all Microsoft 365 content, including searching through Exchange e-mail messages, Yammer (Viva Engage) and Teams chat messages and, of course, OneDrive and SharePoint content (please refer to the MS’s original doc). Once we have a registered Azure app configured correctly, including Authentication and API permissions (more on this), we should be ready to authenticate and call Graph API on behalf of a user.
Let me focus on searching in SharePoint Online and OneDrive here, but you can use the same technique to search through other Microsoft 365 services. I will use PowerShell, but the same ideas should work for other platforms/languages: Python, C#, Node.js etc.
Let us authenticate first. We’d need a MSAL.PS module for that.
# Ensure we have MSAL.PS module installed
Get-Module MSAL.PS -ListAvailable | ft name, Version, Path
# Install-Module MSAL.PS -Force -Scope CurrentUser -AcceptLicense
Import-Module MSAL.PS
# Authenticate to Microsoft Interactively
$clientid = 'd82858e0-ed99-424f-a00f-cef64125e49c' # your client id
$TenantId = '7ddc7314-9f01-45d5-b012-71665bb1c544' # your tenant id
$token = Get-MsalToken -TenantId $TenantId -ClientId $clientid -Interactive
$headers = @{Authorization = "Bearer $($token.AccessToken)" }
Below is how I search Microsoft 365 content programmatically from PowerShell using MS Graph API, authenticated as a user:
# Search
# MS Graph Search API url (beta or v1.0):
$apiUrl = "https://graph.microsoft.com/beta/search/query"
# specify where to search - entity types, as a JSON array (JSON requires double quotes)
# wider alternative: '["driveItem","listItem","list","drive","site"]'
$entityTypes = '["driveItem","listItem"]'
# query
$query = "test*"
# build a simple request body
$body = @"
{
    "requests": [
        {
            "entityTypes": $entityTypes,
            "query": {
                "queryString": "$query"
            }
        }
    ]
}
"@
# call Graph API:
$res = Invoke-RestMethod -Headers $headers -Uri $apiUrl -Body $body -Method Post -ContentType 'application/json'
# explore returned object
$res.value[0].searchTerms
$res.value[0].hitsContainers[0].hits
$res.value[0].hitsContainers[0].hits.Count
$res.value[0].hitsContainers[0].moreResultsAvailable
I used the “beta” search API for research and demos, but in production code you’d stick with “v1.0”.
You can scope the search down using entity types: 'driveItem', 'listItem', 'list', 'drive', 'site'. Here “driveItem” represents a file or folder in a document library, while “drive” represents the document library itself.
In the query string you can use KQL (Keyword Query Language).
Always check whether more results are available with “$res.value[0].hitsContainers[0].moreResultsAvailable”. If there are more results and you need them, consider looping using a paging technique.
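To illustrate the KQL point, here is a tiny Python sketch that combines free text with property restrictions into one query string. The helper `kql` is hypothetical, the site URL is a placeholder, and the sketch assumes the values contain no double quotes; `filetype` and `path` are standard SharePoint managed properties.

```python
def kql(text=None, **properties):
    """Join free text and property restrictions (name:"value") with AND."""
    parts = [text] if text else []
    for name, value in properties.items():
        parts.append(f'{name}:"{value}"')
    return " AND ".join(parts)


query = kql(
    "budget*",
    filetype="xlsx",
    path="https://contoso.sharepoint.com/sites/finance",  # placeholder site URL
)
print(query)
# budget* AND filetype:"xlsx" AND path:"https://contoso.sharepoint.com/sites/finance"
```

The resulting string goes into “queryString” of the request body exactly like the plain “test*” query used above.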
Search M365 content from code: use-cases
Why do we need to implement search in our applications?
Use-cases for search on behalf of current user
Along with the usual ones – where you just need your app to search for some data and bring it to user – there is one different scenario I’d like to share:
You need to quickly detect content in SharePoint that is open for everyone
The brute-force solution, getting a detailed permissions report for all SharePoint sites, might not be feasible, especially in large environments: it is a very resource-consuming task and might take days or weeks. So consider the following…
Since search is security-trimmed, a user gets only search results he/she already has access to. But what if we create an account, do not grant it any SharePoint permissions or group memberships, and then search for everything on behalf of this account? Everything that search returns would then be content that is shared with everyone. There are some tricks and gotchas; here is the separate article on the same.
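Once you have the hits returned for such a “no permissions” account, a natural next step is to group them by site to see where the open content lives. Below is a Python sketch of that step only. The function names are hypothetical, the sample URLs are placeholders, and it assumes each hit carries `resource.webUrl` and that sites live under `/sites/<name>` or `/teams/<name>`.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(web_url):
    """Return the site URL for /sites/<name> or /teams/<name> paths, else the host root."""
    parts = urlsplit(web_url)
    segments = [s for s in parts.path.split("/") if s]
    root = f"{parts.scheme}://{parts.netloc}"
    if len(segments) >= 2 and segments[0].lower() in ("sites", "teams"):
        return f"{root}/{segments[0]}/{segments[1]}"
    return root


def open_content_by_site(hits):
    """Count hits per site; each hit was visible to an account with no permissions."""
    counter = Counter()
    for hit in hits:
        url = hit.get("resource", {}).get("webUrl")
        if url:
            counter[site_of(url)] += 1
    return counter


# Placeholder hits shaped like the items in hitsContainers[0].hits
sample_hits = [
    {"resource": {"webUrl": "https://contoso.sharepoint.com/sites/hr/Shared%20Documents/policy.docx"}},
    {"resource": {"webUrl": "https://contoso.sharepoint.com/sites/hr/Shared%20Documents/salaries.xlsx"}},
    {"resource": {"webUrl": "https://contoso.sharepoint.com/teams/sales/Shared%20Documents/plan.xlsx"}},
]

for site, count in open_content_by_site(sample_hits).most_common():
    print(count, site)
```

Sites at the top of this list are the first candidates for a permissions review.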
Use-cases for unattended search
What are the use-cases when you need to search in your daemon app or background job? Be aware that when you search with application credentials, search is NOT security-trimmed and your query runs against ALL SharePoint content. Here are some possible scenarios.
- Content detection/investigation
  - Let’s say you want to verify that some data is never shared with anyone and never appears in search for anyone
  - Or you might want to find out at which locations some specific data is available
- Imagine you are building a site classification system and you use indexed custom site properties, so you are able to refine search results based on site metadata to get a list of specific sites (adaptive scopes used in retention policies are based on the same mechanics)
- Automation: let’s say you have a requirement to configure every tenant site in some way, for instance to add some hosts to allowed domains for embedding video, set some site properties based on who created the site, or activate or deactivate some features. How would you do that? You’d probably have a scheduled job that runs, say, every hour against only new sites, i.e. sites created during the last hour. How would you get these recently created sites? Search with Graph API is the only nice solution today.
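For the automation scenario, the search request for recently created sites could be built along these lines. This Python sketch makes several assumptions you should verify in your own tenant: that the `Created` managed property is queryable for sites, that `contentclass:STS_Site` is the restriction you want (it targets site collections), and that “NAM” is your region. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def new_sites_query(hours, now=None):
    """KQL for sites created within the last `hours` hours (UTC timestamps)."""
    now = now or datetime.now(timezone.utc)
    since = now - timedelta(hours=hours)
    stamp = since.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"contentclass:STS_Site AND Created>={stamp}"


def new_sites_body(hours, region="NAM", now=None):
    """Request body for an unattended (app-only) call, so region is included."""
    return {
        "requests": [
            {
                "entityTypes": ["site"],
                "query": {"queryString": new_sites_query(hours, now=now)},
                "from": 0,
                "size": 100,
                "region": region,
            }
        ]
    }


# Fixed "now" so the example output is reproducible
fixed_now = datetime(2024, 9, 1, 12, 0, 0, tzinfo=timezone.utc)
print(new_sites_query(1, now=fixed_now))
# contentclass:STS_Site AND Created>=2024-09-01T11:00:00Z
```

Keep in mind that a site appears in search results only after it has been crawled, so a job that looks back slightly longer than its own schedule interval is less likely to miss sites.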
Index of other articles on the subject:
- Search Microsoft 365 content programmatically: Index
- Search Microsoft 365 content programmatically: Use-case scenarios
- Authentication to Microsoft Graph: Azure Registered Apps Certificates and Secrets
- Authorization to Microsoft Graph: Azure Registered Apps API permissions
- Calling Microsoft Graph Search API from code as current user
- Calling Microsoft Graph Search API from daemon/service app
- Using Microsoft.Graph PowerShell module to Search in Microsoft 365
- Using PnP.PowerShell module to Search in Microsoft 365