Though search in Microsoft 365 SharePoint is good out-of-the box and you can do a full-text search and refine your results by “File type” and “Last modified”, but what if you want your content be tagged with your custom metadata (e.g. “Article category”), and you want to be able to refine your search results based on this metadata? I’d say it is possible and I’ll provide the solution below. The solution includes working with site term store (creating terms, term groups, term sets), configuring list/library columns and updating site search schema (mapping crawled properties to managed properties).
Category Archives: Software
PowerShell Script for Files Deduplication
If you think you have a lot of duplicated files that consumes your hard drive storage space, this article is for you. Personally, I have a lot of video and pictures on my working hard drive and on a backup HDD. While working with photos and videos I can rename files, copy or move them from folders to folders. As a result, I end up with gigabytes and terabytes occupied with duplicated files. So I need a tool to a) find duplicated files and b) remove duplications. I tried to find good scripts but somehow I was not happy with what I found, so I wrote scripts myself.
Surely you can buy/try a 3-rd party toot with GUI, but if you are comfortable with PowerShell – consider the following.
…
References
- PowerShell scrips at GitHub (use on your own risk)
Restricted SharePoint Search Deep Dive
Restricted SharePoint Search is a new Microsoft feature to mitigate sites oversharing issue when you are implementing Copilot. The feature is documented here, but still I have some questions, e.g.:
- How about external data? Copilot can use external data to learn from via agents and connectors. But would Restricted SharePoint Search if implemented allow data from external connectors to be used in copilot?
- “Users’ OneDrive files, chats, emails, calendars they have access to” – means own data for every single user or all shared OD data?
- What exactly is “Files from their frequently visited SharePoint sites”? I mean, how frequently user needs to visit site for this?
- What exactly means “Files that the users viewed, edited, or created.”
- What about teams chat messages, e-mails, viva engage messages?
- “Files that were shared directly with the users” – does that mean “individual files shared” or can include folders, libraries, sites?
- If user is a member of a teams – would all team content included?
- It says “Files…” but would site pages be included? Or list items? Or list items attachments? Pages is something that people use to create wiki to share knowledge.
- How long it takes for Microsoft 365 to start restricting results after Restricted SharePoint Search is enabled
- How to deal with “You do not have the required license to perform this operation”
Here I’m going to answer the questions above.
So far I build a test scenario using my dev tenant that includes multiple collaborated users and content in the form of files, pages, list items and messages spreaded across multiple sites falling into different categories of Restricted SharePoint Search allowed content.
You do not have the required license…
If you are getting “You do not have the required license to perform this operation” when you are trying Get-SPOTenantRestrictedSearchMode or Get-PnPTenantRestrictedSearchMode – that means there is no Copilot for Microsoft 365 licenses assigned to tenant yet. This feature – Restricted SharePoint Search – works only when at least one Copilot license is assigned to tenant.
… TBC
References
SharePoint Governance
Governance in IT is establishing rules, policies, tools and practices that helps you manage and protect your enterprise resources. SharePoint governance (or wider – Collaboration governance) covers
- resources ownership and lifecycle
- users’ access to resources
- compliance with your business standards
- security of your data
References
Granular Application Permissions to SharePoint
In 2021 Microsoft implemented “Sites.Selected” Graph API permissions to allow application access (without a signed in user) to specific sites (entire site only). In 2024 Microsoft implemented granular access – to specific list/libraries, as well as to specific documents/files and list items. Now name convention is *.SelectedOperations.Selected.
Permissions come in two flavors – delegated and application:
- Files.SelectedOperations.Selected – Allow the application to access a subset of files (files explicitly permissioned to the application). The specific files and the permissions granted will be configured in SharePoint Online or OneDrive.
- ListItems.SelectedOperations.Selected – Allow the application to access a subset of lists. The specific lists and the permissions granted will be configured in SharePoint Online.
- Lists.SelectedOperations.Selected – Allow the application to access a subset of lists. The specific lists and the permissions granted will be configured in SharePoint Online.
Restricted SharePoint Search rationale
Restricted SharePoint Search is a new (2024) Microsoft 365 feature that should help Copilot and general search results be more relevant, especially in large Microsoft 365 environments.
The problem background
When you have a really big number of sites – it is very difficult to keep them all in a well-managed state, e.g. to have reasonable (minimal) permissions provided to each site. So the typical situation (unfortunately) is: we have a lot of overshared sites. There are also a lot of ownerless sites where permissions are not managed. We know that search is security-trimmed, i.e. a user can get search results from content he/she already has access to. But with overshared sites – users get results they should not be able to see. With regular search experience – a user can see with his own eyes the source of the content he/she gets results from – so user can understand that results are coming from sites user should not have access to (overshared sites). But when it comes to AI-based search (Copilot) – user is getting answers, but he/she do not always know the source of that data.
So the problem is – we want to ensure Copilot is trained on a proper set of data and results are curated to users needs and access permissions. So for Copilot we really need to exclude from search scope such sites we are not sure content is valid, accurate and properly secured. We do not want users to get garbage or exposed sensitive information as an authoritative answer from Copilot.
The solution
This is where Restricted SharePoint Search feature should help, as with this feature your can restrict organization-wide search (and Copilot) to a curated list of SharePoint sites – “allowed sites” – public sites that passed attestation and where permissions are checked and data governance policies are applied, and content user work with on daily basis – his/her own documents and content shared with user directly (check details on Microsoft’s How does Restricted SharePoint Search work) – e.g. content user is supposed to have access to normally.
Excluded from search scope would be sites shared with user indirectly, e.g. something that was shared with everyone.
The root cause
Interesting, that with this feature Microsoft is not solving the real issue, but hiding (concealing) the real issue and just making Microsoft 365 to look more secure.
The real problem (root cause) is over-sharing data. But Microsoft already sold us SharePoint (and then Microsoft 365). And now Microsoft is trying to sell us Copilot, so they “solved” the over-sharing issue with “let us limit search” solution instead of “let’s fix oversharing”.
Note 1: Restricted SharePoint Search feature is free – i.e. it is included in standard Microsoft 365 license. Do not be confused with site access restriction policy – feature that require SharePoint Premium license and allows to restrict access to some SharePoint sites with specific groups only.
Note 2: I know that Microsoft is trying to address over-sharing issue as part of their SharePoint Premium (SharePoint Advanced Management) package, e.g. with AI Insights and Data access governance insights – reports that can help prevent oversharing by detecting sites that contain potentially overshared or sensitive content. With Manage content lifecycle we’d decrease amount of “garbage” or outdated content.
But SharePoint Advanced Management is licensed separately, when Restricted SharePoint Search is free.
Note 3: I know that users are an even more real problem because they tend to simplify and share information irresponsibly.
References
New Microsoft Graph Connector service plan
Microsoft Graph connectors allow your organization to index third-party data into Microsoft Graph. Microsoft Graph connectors enable Microsoft 365 Copilot better as it has more information relevant to your organization to answer prompts.
According to Microsoft, Microsoft 365 will soon include a new service plan, Graph Connectors Search with Index, offering a 50 million item index limit per tenant at no cost. Rollout starts September 2024.
Previously, to index third-party data into Microsoft Graph through Microsoft Graph connectors, you either needed to have a built-in entitlement through specific licenses (e.g., 500 items of index quota per Microsoft Copilot for Microsoft 365 license) or purchase add-on quota. With this change, the index quota per license entitlement is removed, as is add-on cost for additional quota. You now receive an entitlement of 50 million items for each tenant.
Each entity (or record) from the source data system that you add to Microsoft Graph can be considered an item which then shows up as a unique citation in Copilot’s responses, as a unique search result in Microsoft Search, etc. Depending on the type of data source, 1 item is –
- 1 document (word, excel, ppt, pdf, etc.) in file share
- 1 wiki page in Confluence
- 1 webpage in a website
- 1 ticket/issue in Jira
Total quota utilized is calculated in terms of total items stored in the index. Updates/changes to an item are not counted in any manner. There are no cost implications of updating an item multiple times. It still counts as 1 item only.
Applicable to subscriptions: Office 365 E1, Office 365 E3, Office 365 E5, Microsoft 365 E3, Microsoft 365 E5, Microsoft 365 F1, Microsoft 365 F3, Office 365 F3, Microsoft 365 Business Basic, Microsoft 365 Business Standard, Microsoft 365 Business Premium, Office 365 G1, Office 365 G3, Office 365 G5, Microsoft 365 G3, Microsoft 365 G5, Office 365 A3, Office 365 A5, Microsoft 365 A3, Microsoft 365 A5
Update SharePoint Site Title: GUI vs PowerShell
If you need to update a SharePoint site title (site name) programmatically (e.g. with PowerShell), and if this site is a group-based site (e.g. Microsoft Teams team site or Viva Engage community site or…) – you should not update SharePoint site title, but you should update group display name instead. Here is why.
In Microsoft 365 there is no sync from SharePoint site title to a group name. When you are updating SharePoint site title with GUI – you can see that new site title becomes new group/team name as well. So you might think that if you update SharePoint site title – Microsoft synchronizes it to connected group name. That’s not true. Actually when you are updating a group-based (e.g. teams-connected) SharePoint site title with GUI – Microsoft updates group first, then syncs updated group display name to SharePoint site name (title).
Here is the proof:
That’s a network trace I got with browser dev tools when I renamed site (updated site title) with GUI. So you can see the first API call is to update group, then group properties are synced back to site.
When we are updating a standalone site title – we are not seeing these calls.
So, if you need to update group-based site title programmatically – you must update group instead.
# does not work for group-based (e.g. Teams) sites:
Set-PnPTenantSite -Identity ... -Title "New Site Title"
# instead, you'd update group display name
Set-PnPMicrosoft365Group -Identity ... -DisplayName "New Display Name"
# and site title will be updated accordingly